A NON-SPLIT SUM OF COEFFICIENTS OF MODULAR FORMS 



NICOLAS TEMPLIER 



Abstract. We shall introduce and study certain truncated sums of Hecke eigenvalues of
$GL_2$-automorphic forms along quadratic polynomials. A power saving estimate is established
and new applications to moments of critical L-values associated to quadratic fields
are derived. An application to the asymptotic behavior of the height of Heegner points and
singular moduli is discussed in detail.
1. Introduction.

Bounding sums of arithmetic functions is a classical and central problem in analytic
number theory. In this paper we shall introduce certain sums of coefficients of modular forms
that may be used as variants of shifted convolution sums in certain circumstances.

1.1. Main result. Let $\mathfrak{H} = \{x+iy,\ y > 0\}$ be the Poincaré upper half-plane. Let $f : \mathfrak{H} \to \mathbb{C}$
be a classical modular form of weight 2, trivial Nebentypus and odd squarefree level. Let

(1.1) $f(z) = \sum_{n=1}^{\infty} n^{1/2} \lambda_f(n) e^{2i\pi nz}, \quad z \in \mathfrak{H}$

be its normalized Fourier expansion at infinity. We shall establish the following estimate:

Theorem 1. There are absolute constants $\eta, \eta' > 0$ such that the bound

(1.2) $\sum_{N<n<2N} \lambda_f(n^2 + d) \ll_f N^{1-\eta}$

holds uniformly for all couples $(d, N)$ where $d$ is a prime number with $d \equiv 3 \pmod 4$ and $N$
is a positive number with $d^{1/2-\eta'} \le N \le d^{1/2+\eta'}$.
Remark 1. The left-hand side is a sum of length $N$. The direct application of Deligne's
bound $|\lambda_f(n)| \le \tau(n)$, where $\tau$ is the divisor function, would yield the bound $\ll N \log N$.
The bound (1.2) saves a small power of $N$.
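
The comparison in Remark 1 can be explored numerically. The following Python sketch is only an illustration under stated assumptions: for $f$ we take the weight 2 newform of level 11 given by the eta product $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$ (our choice of example, not one made by the text at this point), and all function names and the sample values of $d$ and $N$ are hypothetical. It evaluates the left-hand side of (1.2) together with the trivial bound coming from the divisor function.

```python
from math import sqrt

def level11_coefficients(M):
    """Return a list a with a[m] = m-th Fourier coefficient (1 <= m <= M) of
    q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2.
    Assumption: this eta product is our illustrative choice of f."""
    series = [0] * M          # series[k] = coefficient of q^k in the infinite product
    series[0] = 1
    for n in range(1, M):
        for step in (n, 11 * n):
            if step >= M:
                continue
            for _ in range(2):                     # each factor occurs squared
                for k in range(M - 1, step - 1, -1):
                    series[k] -= series[k - step]  # multiply in place by (1 - q^step)
    return [0] + series       # shift by the leading q: a[m] = series[m - 1]

def divisor_count(n):
    """The divisor function tau(n), by brute force."""
    return sum(1 for t in range(1, n + 1) if n % t == 0)

def hecke_eigenvalue(a, m):
    """Normalized eigenvalue lambda_f(m) = a(m) / sqrt(m), as in (1.1)."""
    return a[m] / sqrt(m)

def nonsplit_sum(a, d, N):
    """Sum of lambda_f(n^2 + d) over N < n < 2N, the left-hand side of (1.2)."""
    return sum(hecke_eigenvalue(a, n * n + d) for n in range(N + 1, 2 * N))

def trivial_bound(d, N):
    """Sum of tau(n^2 + d) over N < n < 2N, the bound given by Deligne's estimate."""
    return sum(divisor_count(n * n + d) for n in range(N + 1, 2 * N))

if __name__ == "__main__":
    d, N = 103, 10            # sample values: d prime, d = 3 mod 4, N close to sqrt(d)
    a = level11_coefficients((2 * N) ** 2 + d)
    print(nonsplit_sum(a, d, N), trivial_bound(d, N))
```

Such small ranges say nothing about the asymptotic regime of Theorem 1; the sketch only makes the two sides of the comparison explicit.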

The typical example is when $N = d^{1/2} \to \infty$. In the theorem we allow some more freedom
for N because this flexibility is needed for applications and comes naturally from the method 
of proof. 

Remark 2. The exponents $\eta, \eta'$ could be made explicit and are equal to the exponent given
in section 7.1 (we shall assume for simplicity $\eta = \eta'$ in the sequel). Our approach is not
well-suited to optimize the value of the exponents because it relies on a large number of
transformations, each one carrying some loss.

Remark 3. Independently, V. Blomer [3] has established a result similar to (1.2) when $d$ is
fixed and $N \to \infty$. Here the constraint $N \le d^{1/2+\eta'}$ makes the length of the $n$-sum shorter.

Remark 4. In the present article we do not work out the case $d < 0$. However let us recall the
known case where $d$ is the opposite of a perfect square, say $d = -h^2$. Then the quadratic
polynomial $n \mapsto n^2 - h^2$ would split and the left-hand side of (1.2) would essentially reduce
to

(1.3) $\sum_{N<n<2N} \lambda_f(n-h)\lambda_f(n+h)$

(because of the multiplicativity of $\lambda_f$). A. Selberg [58] was the first to study these sums.
Producing a non-trivial estimate for (1.3) is the Shifted Convolution Problem (SCP) for two
$GL(2)$ forms, whose resolution is a cornerstone for many further developments^1 (see [39] for a
good survey). This distinction between split and non-split polynomials justifies why we may
call the left-hand side of (1.2) "a non-split sum".

Remark 5. The first occurrence of a non-split quadratic polynomial in this kind of problem
appears in a work of C. Hooley [27]. The result of that paper and further developments,
notably [13], have had an important influence on the present paper. We refer to a forthcoming
survey for a detailed discussion of the nexus; a key insight is that a consequence of Duke's
Theorem [10] is the uniform distribution of

(1.4) $\{ u/q \ :\ u^2 + d \equiv 0 \pmod q,\ u \in \mathbb{Z}/q\mathbb{Z},\ 1 \le q \le d^{1/2} \}$

inside $\mathbb{R}/\mathbb{Z}$ as $d \to +\infty$. This fact is not used explicitly in the proof of Theorem 1, but
nevertheless lies in the background and has provided a guideline through our work.
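
The point set (1.4) is easy to enumerate for small $d$. The sketch below is a brute-force illustration, not part of the argument; the function names are ours, and the range $1 \le q \le d^{1/2}$ is taken as in (1.4). It lists the fractions $u/q$ and measures the proportion falling below a threshold $t$, which uniform distribution predicts to be close to $t$ for large $d$.

```python
from fractions import Fraction
from math import isqrt

def roots_mod(d, q):
    """All u in Z/qZ with u^2 + d = 0 (mod q), found by brute force."""
    return [u for u in range(q) if (u * u + d) % q == 0]

def root_fractions(d):
    """The points u/q of (1.4), as exact fractions in [0, 1), for 1 <= q <= sqrt(d)."""
    return sorted(Fraction(u, q)
                  for q in range(1, isqrt(d) + 1)
                  for u in roots_mod(d, q))

def proportion_below(points, t):
    """Proportion of the points lying in [0, t)."""
    return sum(1 for x in points if x < t) / len(points)

if __name__ == "__main__":
    points = root_fractions(10007)   # sample value of d, chosen for illustration only
    for t in (0.25, 0.5, 0.75):
        print(t, proportion_below(points, t))
```

Since $u$ is a root exactly when $q-u$ is, the point set is symmetric under $x \mapsto 1-x$, which gives a quick consistency check on the enumeration.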

Remark 6. Although we did not state it explicitly, the proof of Theorem 1 is valid for
modular forms $f$ of arbitrary even weight $2k$ and odd squarefree level^2. The only change is
that the Bessel function $J_1$ is replaced by the Bessel function $J_{2k-1}$. The proof also works for
Maass forms of odd squarefree level, although it yields statement (1.2) in its smooth version^3
only (slightly weaker) because Deligne's bound is not available for Maass forms.



^1 In the classical SCP we may choose $N$ as small as a power of $h$ depending on $\theta$, where $\theta$ is the exponent towards Ramanujan-Petersson,
which is to be compared with the assumption $d^{1/2-\eta'} \le N$ in Theorem 1.

^2 Note that we do not claim any precise bound in the weight nor level aspect. In this article all the constants
involved in the bounds $\ll_f$ are polynomials (with a large exponent) in the weight and the level of $f$.

^3 By "smooth version" we mean that $\sum_{N<n<2N}$ is replaced by $\sum_n V(n/N)$ where $V$ is smooth ($C^\infty$) of
compact support.






The condition that the level of $f$ is odd and squarefree is a technical assumption that simplifies
the computations in sections 5 and 6. We expect that a variant of Theorem 1 would hold for
all cuspidal automorphic forms on $GL(2)_{\mathbb{Q}}$.

Remark 7. Theorem 3 from section 7 provides a slightly more general version. The difference
with Theorem 1 is on the restriction that $d$ is a prime number. In Theorem 3, we allow $d$ to
be squarefree with all its prime factors $> d^{\epsilon}$, where $\epsilon > 0$ is fixed in advance. This assumption
on $d$ is a technical assumption that arises in the explicit computations from section 5. We
expect that estimate (1.2) would hold for all positive integers $d$, see also the next remark.

Remark 8. In Theorem 3, we also allow a square part, replacing $d$ by $de^2$ with $e \ge 1$, because
it is needed for applications. The dependence on $e$ is polynomial: $e^{O(1)}$. It should be possible
to obtain a sharp estimate in this parameter. This would involve a fine analysis at the finite
places and should be closely related to a recent theorem of V. Vatsal [66]. We shall not discuss
this interesting issue in the present paper.

Remark 9. In a recent work of R. Holowinski [26], which relies on very different methods
(sieve and partial results towards Sato-Tate), estimates that save a power of $\log N$ in several
SCP of absolute values of Hecke eigenvalues are established. It would be interesting to
investigate bounds for

(1.5) $\sum_{N<n<2N} |\lambda_f(n^2 + d)|$

(for instance with $d$ fixed and without the constraint $N \le d^{1/2+\eta'}$ in a first attempt). When
$-d$ is not a perfect square it is not clear how one could proceed.

1.2. Moments of L-functions. Theorem 1 arises in the study of moments of L-functions
associated to quadratic number fields. In this section we recall what is already known and in
the next one we explain our new applications. Let $D < 0$ be the discriminant of an imaginary
quadratic field $K = \mathbb{Q}(\sqrt{D})$. Let $\mathcal{O}_D$ be the ring of integers and $\mathrm{Cl}_D$ the ideal class group.
One may associate to unitary characters $\chi \in \widehat{\mathrm{Cl}}_D$ on this group many interesting L-functions.^4
It is important and challenging to determine asymptotically the average of the critical values
of these L-functions. The average is with respect to $\chi \in \widehat{\mathrm{Cl}}_D$ (one speaks of the moments of
the family in the classical terminology introduced by [31]). Main examples are as follows:

(A) The Hecke L-function $L(s,\chi)$ is the most organic. The first and second moment of
$L(1/2,\chi)$ have been studied by Duke, J. Friedlander and H. Iwaniec [12,61]. Quantitative
non-vanishing has been obtained by V. Blomer [2]. A subconvex bound in the
$D$-aspect has been established in [14].

(B) Let $\psi$ be a "canonical" Hecke character on $\mathbb{Q}(\sqrt{D})$ of conductor $\sqrt{D}\mathcal{O}_D$ (the terminology
is from [57]). Consider the Hecke L-functions $L(s,\psi\chi)$; assume that
the sign of the functional equation is $+1$. Quantitative nonvanishing of $L(1/2,\psi\chi)$
has been studied by D. Rohrlich and others [38,44,45,51,55,56,68]. The asymptotic
for the first moment has been computed by C. Liu, L. Xu, B. Kim, R. Masri and
T. Yang [32,32,35,37]. A subconvex bound in the $D$-aspect follows from [14].

(C) Let $f$ be a primitive modular form or a primitive Maass form. The L-series $L(s, f \times \chi)$
may be defined via the Rankin-Selberg method. A subconvex bound in the $D$-aspect
has been established in [24,42]. The sign of the functional equation is $\pm 1$. When the
sign is $+1$, the first moment of $L(1/2, f \times \chi)$ and the quantitative nonvanishing have
been obtained by Ph. Michel and A. Venkatesh [41].

(D) Let $L(s, f \times \chi)$ be as in (C), but assume that the sign of the functional equation is $-1$
and $f$ is holomorphic of weight 2. Partial results on the first moment of the special
derivative $L'(1/2, f \times \chi)$ have been obtained by G. Ricotta and T. Vidick [49,50] (on
average over $D$), by Michel and Venkatesh [41] (under an unproven hypothesis),
and by the author [63,65] (a lower bound for the first moment).

In all four Cases (A-D) the conductor of the L-function is $\asymp D^2$ (in Case (A) the second
moment is the most relevant and the conductor of $L(s,\chi)^2$ is $\asymp D^2$), and the size of the family is
the class number $h(D)$ (which is roughly $|D|^{1/2}$ as $D \to -\infty$). The respective moments thus
are:

(1.6) $\frac{1}{h(D)} \sum_{\chi\in\widehat{\mathrm{Cl}}_D} L(1/2,\chi)^2;\quad \frac{1}{h(D)} \sum_{\chi\in\widehat{\mathrm{Cl}}_D} L(1/2,\psi\chi);\quad \frac{1}{h(D)} \sum_{\chi\in\widehat{\mathrm{Cl}}_D} L(1/2, f\times\chi);\quad \frac{1}{h(D)} \sum_{\chi\in\widehat{\mathrm{Cl}}_D} L'(1/2, f\times\chi).$

^4 In the sequel we always choose the unitary normalization for the L-series of principal automorphic forms
$\pi$: the functional equation links $L(s,\pi)$ with $L(1-s,\pi)$, in particular the critical line is $\Re s = \tfrac12$.

In these four Cases, "period formulas" have been extensively studied. These formulas
link each L-value (or derivative in Case (D)) to a certain period of a quadratic cycle on a
Shimura curve. As a corollary each L-value is nonnegative as predicted by the GRH. For the
convenience of the reader, we briefly locate these period formulas in the literature. Case (A)
is Hecke's formula, see [59]. The formula for Case (B) is due to F. Rodriguez-Villegas and
D. Zagier [52-54] when the root number is $+1$ and to Yang [69] when the root number is
$-1$. When $\chi$ is trivial, Case (C) is due to J.-L. Waldspurger [67]. When $\chi$ is arbitrary and $f$
holomorphic, it is due to B. Gross and Zagier [17,20], see also [28,36]. When $\chi$ is arbitrary
and $f$ is a Maass form the formula is due to S.-W. Zhang [72] (see also A. Popa [47] for
real quadratic fields). Case (D) is the Gross-Zagier formula [20] which has been recently
generalized by Zhang, X. Yuan and W. Zhang [70,72,73].

These period formulas yield a closed expression for the moments (1.6) above. Michel and 
Venkatesh observed [41], by analogy with Vatsal's work [66], that these expressions can be 
combined with Duke's theorem to determine the asymptotics of the moments. In [41] they 
address Case (C). Then Case (B) has been treated in [32,37,38] and Case (A) in [61]. 

Case (D) is more subtle because the period formula involves heights of Heegner points.
The article [63] provides a short argument that yields a lower bound which is sufficient for
certain applications. A more ambitious approach that would yield the exact asymptotic with
power saving for these heights has been developed in the author's PhD thesis [65] following
ideas from [40, section 2.4]. This approach contains several difficulties that are not yet
surmounted.^5

1.3. An application of Theorem 1. In order to solve Case (D) completely, we shall forget
about these deep period formulas alluded to in the previous section and go back to pure
analytic methods that make use of the functional equation only.^6 Although we do not make
it explicit this approach could settle also Case (C) in a uniform manner^7. In some sense
the estimate (1.2) from Theorem 1 should be considered as lying at the heart of the question
of moments of L-functions associated to class group characters, as long as the conductor is
$D^2$.

^5 Except when $f$ is the level 11 form, where we observed [65, section 6.4] that huge cancellations occur in
the regularized local heights made explicit by Gross-Zagier.

^6 I am very grateful to Peter Sarnak who suggested that I do so.

^7 In Case (A) see also [13], in Case (B) see also [44].

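Theorem 2. Let $f$ be a weight 2 primitive modular form of odd squarefree level $N$. There
exists an absolute constant $\eta_5 > 0$ such that the following estimate holds uniformly on the
prime discriminants $D$ satisfying $\chi_D(N) = 1$,

(1.7) $\frac{1}{h(D)} \sum_{\chi\in\widehat{\mathrm{Cl}}_D} L'(1/2, f\times\chi) = 4\,\frac{L^{(N)}(1,\chi_D)}{\zeta^{(N)}(2)}\,L(1,\mathrm{Sym}^2 f)\,\Bigl[\tfrac12\log|DN| + \frac{L'^{(N)}}{L^{(N)}}(1,\chi_D) + \frac{L'}{L}(1,\mathrm{Sym}^2 f) - \frac{\zeta'^{(N)}}{\zeta^{(N)}}(2) - \gamma - \log 2\pi + O_f(|D|^{-\eta_5})\Bigr].$

Here $\gamma$ is the Euler constant; $L(\cdot,\mathrm{Sym}^2 f)$ is the symmetric square L-function; the superscripts
in $\zeta^{(N)}$ and $L^{(N)}$ indicate that the Euler factors at prime divisors of $N$ have been removed.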

As a consequence of the Gross-Zagier formula we may deduce very precise information on
the height of Heegner points on elliptic curves. This is explained in section 3.

Remark 10. The asymptotic behavior of $\tfrac12\log|D| + \frac{L'}{L}(1,\chi_D)$ is recalled in § 1.8. As a
consequence, the bracket in the right-hand side of (1.7) tends to $+\infty$ as $D$ gets large, which
is consistent with the fact that the left-hand side is nonnegative for every $D$, as follows from
the Gross-Zagier formula (or would follow from the GRH).

Remark 11. The residual quantity $\frac{L'}{L}(1,\mathrm{Sym}^2 f) - \gamma - \log 2\pi$ appears in other contexts related
to height functions, in particular for the self-intersection of the dualizing sheaf of $X_0(N)$,
see [1]. This is not a coincidence.

Remark 12. The fact one can bypass the use of period formulas in the proof of Theorem 2 
has a significance and may be exploited further to gain deep insights: a common ingredient, 
explicit or implicit, in all the methods (analytic and geometric ones) is a relative trace formula 
for the arithmetic pair $GL_2(\mathbb{Q}) \supset \mathbb{Q}(\sqrt{D})^{\times}$. The real difference between the geometric and
the analytic approach lies in the order in which the steps are performed. Hopefully there 
should exist a unifying framework which comprises both period formulas and asymptotics for 
moments of critical values of L-functions. We do not develop the idea further in this paper. 
See also [36,48]. 

1.4. Outline of the proof of Theorem 2. The first task is to express the special value in
a convenient fashion. This is done by applying the approximate functional equation method,
see identity (4.10). This method has been used several times in the past and is quite robust
since it relies only on the functional equation. For instance it puts Cases (C) and (D) on
equal footing.

Then it is possible to extract a main term; this is discussed in § 4.3 by means of the counting
function $r_D$, see (4.11) and (4.16). The remainder term contains a combination of sums of $\lambda_f$
against quadratic polynomials and Theorem 1 is exactly what we need to save a small power
of $|D|$, see § 4.4.

1.5. Outline of the proof of Theorem 1. First of all we need to stress that our proof
relies on an auxiliary result, Theorem A whose proof will be given elsewhere [64] because it 
involves quite different techniques. The present paper provides all the detailed steps from 
Theorem A to Theorem 1. Main ideas underlying a slightly longer proof of Theorem 1, 
including Theorem A, have been outlined in [62]. 
The first step, carried out in section 6, is to solve analytically the equation $n^2 + d = m$ via the
$\delta$-symbol method [11]. The structure of the argumentation is close to [46]. Roughly speaking
the effect of the $\delta$-symbol method is to replace the Fourier coefficients $\lambda_f(m)$ by sums of
Kloosterman sums. An important difference with previous applications of the $\delta$-symbol is
that we are concerned with savings in the sums over the moduli and not only in the square-root
cancellations of complete exponential sums. Also the choice of certain parameters is
slightly different.

The next step is to apply the Poisson summation formula, see § 7.3. Then a peculiar kind of
complete exponential sum shows up, see (5.1). It may be viewed as a generalization of Salié
sums and carries a square-root cancellation. This cancellation is sufficient to recover the naive
bound $N^{1+\epsilon}$ in Theorem 1.

The final saving is included in the sum over the moduli q. This is the object of section 5. 
First we observe that the exponential sum is related to Jacobi forms. Then we quote without 
proof an estimate (Theorem A) which contains the desired saving. This estimate ultimately 
follows from Iwaniec's celebrated bound [30]. 



1.6. Chowla-Selberg versus Gross-Zagier. To our knowledge this is the first time a link
between these two popular period formulas is stated. Our results imply that when the discriminant
of the quadratic field is large, the Chowla-Selberg and Gross-Zagier formulas become
very close to each other.^8

This may be visualized by the diagram of "equalities" below. Each equality has to be
understood up to an explicit multiplicative constant. The error terms and the multiplicative
constants are discussed at several places throughout the text; the diagram portrays the formal
aspect. The main term in Theorem 2 may thus be interpreted in a beautiful way:

(1.8) $\begin{array}{ccc} \sum_{\chi} L'(1/2, f\times\chi) & \approx & \frac{L'}{L}(1,\chi_D) + \tfrac12\log|D| + O(1) \\ \parallel & & \parallel \\ h(\varphi(z_D)) & \approx & h_{\mathrm{Fal}}(E_D) + O(1) \end{array}$

Explanation: the top row is purely analytic in nature (Theorem 2), and very common in the
theory of moments of L-functions: the moment at 1/2 of a family is asymptotic to a special
value at 1 of L-functions on groups of smaller rank. The second row is closely related to a
key result by Faltings that compares the Faltings height and Weil height functions on moduli
spaces, up to logarithmic terms^9. The first column is the Gross-Zagier formula. The second
column is the Chowla-Selberg formula.



^8 In [34], S. Kudla, M. Rapoport and Yang discuss a distinct situation which involves derivatives of Eisenstein
series as a generating series for the heights. In a recent preprint, J. Bruinier and Yang [7] consider yet
another situation; a difference with our discussion is that they consider the trace of the Heegner points, which
corresponds to choosing $\chi = 1$ in the Gross-Zagier formula (3.1).

^9 The Faltings comparison (Proposition 2.3) would yield only a $O(\log\log|D|)$ instead of $O(1)$ at the bottom
right. But it turns out that for the special case of Heegner points this may be improved as Theorem 2
shows.

1.7. Asymptotic height of singular moduli. Let $j_D$ be the j-invariant of an elliptic curve
with CM by $\mathcal{O}_D$. The theory of complex multiplication says that it is an algebraic integer unique
up to Galois conjugation. In explicit terms, one may choose $j_D = j\bigl(\frac{D+\sqrt{D}}{2}\bigr)$, where

(1.9) $j(z) = \frac{1}{q} + 744 + 196884\,q + \cdots, \quad q = e^{2i\pi z}$

is the classical j-function $\mathfrak{H} \to \mathbb{C}$.
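
As a sanity check on the expansion (1.9), the first coefficients of $j$ can be recomputed from the classical identity $j = E_4^3/\Delta$, with $E_4 = 1 + 240\sum_{n\ge1}\sigma_3(n)q^n$ and $\Delta = q\prod_{n\ge1}(1-q^n)^{24}$. The Python sketch below does this with exact integer arithmetic; the identity is standard, while the function names are ours.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(t ** 3 for t in range(1, n + 1) if n % t == 0)

def multiply(a, b, length):
    """Product of two power series given by coefficient lists, truncated to `length` terms."""
    out = [0] * length
    for i, ai in enumerate(a[:length]):
        if ai == 0:
            continue
        for k in range(length - i):
            out[i + k] += ai * b[k]
    return out

def j_coefficients(M):
    """Return [c(-1), c(0), ..., c(M)] where j(z) = sum of c(n) q^n."""
    length = M + 2
    e4 = [1] + [240 * sigma3(n) for n in range(1, length)]
    e4_cubed = multiply(multiply(e4, e4, length), e4, length)
    # Delta / q = prod (1 - q^n)^24, built by repeated in-place multiplication
    delta_over_q = [1] + [0] * (length - 1)
    for n in range(1, length):
        for _ in range(24):
            for k in range(length - 1, n - 1, -1):
                delta_over_q[k] -= delta_over_q[k - n]
    # q * j = E4^3 / (Delta / q); the divisor has constant term 1, so divide term by term
    coeffs = []
    for k in range(length):
        coeffs.append(e4_cubed[k] - sum(coeffs[i] * delta_over_q[k - i] for i in range(k)))
    return coeffs

if __name__ == "__main__":
    print(j_coefficients(2))
```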

The literature is very prolific on the arithmetic of CM-elliptic curves (see for instance the
references listed in [6] for the theoretic aspect and listed in [5] for the algorithmic aspect),
but an answer to the following simple and natural question does not seem to exist^10: what is
the behavior of the naive height

(1.10) $h(j_D)$, as $D \to -\infty$?

Since this question is partly related to Theorem 2, we take the opportunity to answer it in 
section 2 (see Proposition 2.4) by a geometrical approach, recalling several known facts on 
periods of CM-elliptic curves. Although this question is perhaps known to experts, we believe 
it is important to have a place that discusses it for the sake of non-experts (like the author). 

This question is very natural because the naive height measures the arithmetic complexity
of an algebraic number. The singular moduli $j_D$ are algebraic integers and it is clear from
many sources that their complexity grows quickly with the discriminant $D$. Here are some
pieces of evidence related to $h(j_D)$.

In [19, Table 1] the factorization of the absolute norm of $j_D$ is displayed. The explicit
formula for this norm proved by Gross-Zagier implies the nice result that the prime factors
are all less than $|D|$. Let $P_D$ be the minimal polynomial over $\mathbb{Z}$ of $j_D$. It is of degree $h(D)$
and sometimes called "class polynomial" because $\mathbb{Q}(\sqrt{D}, j_D)$ is the Hilbert class field $H_D$ of
$\mathbb{Q}(\sqrt{D})$. For example [71] displays^11 the value of $P_{-55}$. A standard inequality for heights
yields (see [4, Proposition 1.6.6]):

(1.11) $\sum_{\sigma\in\mathrm{Cl}_D} \log^+ |j_D^{\sigma}| = h(D)\,h(j_D) = \log M(P_D).$

Here $M(\cdot)$ denotes the Mahler measure of the polynomial. It is clear (see also [4, Proposition
1.6.6]) that the latter quantity is larger than:

(1.12) $\log|P_D(0)| = \tfrac12 \log|N_{H_D/\mathbb{Q}}\, j_D| = \sum_{\sigma\in\mathrm{Cl}_D} \log|j_D^{\sigma}|.$

^10 Quoting [6, p. 378]: "these polynomials are generally quite complicated and the basic problem of computing
them and their roots has long history". This is the only answer one usually may read.

^11 It is further observed that the polynomials $P_D$ "have coefficients of astronomical size even for quite modest
discriminants -D", and the authors introduce and compute a variant called Weber polynomials that have far
smaller coefficients and still generate the Hilbert class field. However from the point of view of heights both
$P_D$ and the Weber polynomials have, up to a multiplicative constant, nearby asymptotic complexity. One may
understand why the Weber polynomials are of smaller size, especially for small values of $D$, by contemplating
the leading exponent in the Fourier expansion of the Weber function, which is to be compared with the
exponent $-1$ for the j-function.
Actually a simple application of Duke's theorem yields^12

(1.13) $\tfrac12 \log|N_{H_D/\mathbb{Q}}\, j_D| \sim h(D)\,h(j_D)$, as $D \to -\infty$.

It is possible to run a similar argument for the asymptotic of $\tfrac12\log|N_{H_D/\mathbb{Q}}(j_D - 1728)| =
\log|P_D(1728)|$ for which an exact prime factorization is also displayed in [19]. We leave the
details to the interested reader.

Another interesting quantity is the discriminant of $P_D$, which is directly related to the
index $I_D$ of $\mathbb{Z}[j_D]$ in its integral closure. At least when $D$ is a prime discriminant, one has
(the absolute discriminant of $\mathbb{Q}(j_D)$ when $D$ is prime is computed in the book [21]):

(1.14) $\mathrm{disc}(P_D) = I_D^2\,|D|^{(h(D)-1)/2}.$

The value of $I_D$ is displayed in [19, Table 1], and computed in [19, Corollary 4.8]. From [4,
Proposition 1.6.9] one has the rough bound:

(1.15) $\frac{1}{h(D)} \log \mathrm{disc}(P_D) \le (2h(D) - 2)\,h(j_D) + \log h(D).$

It would be interesting, but perhaps difficult, to obtain a good lower bound for $I_D$ as $D \to -\infty$.
The results in [19] seem to indicate that the growth of $I_D$ is indeed very fast.
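
The degree $h(D)$ of the class polynomial $P_D$ discussed in this subsection is easy to compute for small discriminants. The following Python sketch uses the classical description of the class group by reduced primitive binary quadratic forms $(a,b,c)$ with $b^2 - 4ac = D$; this is a standard method and not the one used in the references above, and the function names are ours. For instance it confirms that $P_{-55}$ has degree 4.

```python
from math import gcd

def reduced_forms(D):
    """Reduced primitive forms (a, b, c) of discriminant D < 0:
    b^2 - 4ac = D, -a < b <= a <= c, and b >= 0 whenever a == c."""
    assert D < 0 and D % 4 in (0, 1), "D must be a negative discriminant"
    forms = []
    a = 1
    while 3 * a * a <= -D:            # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(a, gcd(b, c)) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def class_number(D):
    """The class number h(D), i.e. the degree of the class polynomial P_D."""
    return len(reduced_forms(D))

if __name__ == "__main__":
    for D in (-23, -55, -163):
        print(D, class_number(D), reduced_forms(D))
```

Each reduced form $(a,b,c)$ corresponds to the point $\frac{-b+\sqrt{D}}{2a}$ of the upper half-plane, whose images under $j$ are the conjugates of $j_D$.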

1.8. Log-derivative at 1 of Dirichlet L-series. It is convenient to introduce the following 
notation for a quantity that will appear often in the text: 

(1.16) $\mathcal{L}_D := \tfrac12\log|D| + \frac{L'}{L}(1,\chi_D).$

In this paragraph we recall the asymptotic behavior of this quantity. The Riemann Hypothesis
for $L(s,\chi_D)$ would imply $\frac{L'}{L}(1,\chi_D) = O(\log\log|D|)$, so that^13 one expects $\mathcal{L}_D \sim \tfrac12\log|D|$.
Unconditionally it is possible to prove:

(1.17) {^-e)log\D\^CD<^e\D\' 

for any $\epsilon > 0$ and $D$ large enough. The upper bound follows from Siegel's theorem^14. The lower
bound is a standard consequence of the Burgess estimate (see section 3 of [61] for a proof).

Remark 13. In [9], P. Colmez proves the lower bound $\log|D| \ll \mathcal{L}_D$. This follows from a
uniform version of Weyl's law (proposition 5 in [9]) which is very classical in analytic number
theory (see, e.g., [29, Theorem 5.8]).



^12 Sketch of proof. One needs to control the $j_D^{\sigma}$ whose norm is close to 0. Since $\mathbb{P}^1 \simeq X(1)$ and we may view
$X(1)(\mathbb{C})$ as the hyperbolic quotient $SL_2(\mathbb{Z})\backslash\mathfrak{H}$, this is the same as controlling how close the Heegner points
of discriminant $D$ may be to $\rho = e^{2i\pi/3} = \frac{-1+i\sqrt{3}}{2}$. The logarithmic distance is at least $-O(\log|D|)$, as one may
deduce quickly from the explicit representation of Heegner points. And Duke's theorem states that
the $\{j_D^{\sigma}\}$ are equidistributed for the hyperbolic measure. This is enough to conclude that the negative contribution
$\sum_{\sigma\in\mathrm{Cl}_D} \log^- |j_D^{\sigma}|$ is $o(h(D)\log|D|)$, which is what we need. The error term is obviously poor since one had to
isolate a small region around $\rho$ and to apply Duke's theorem afterwards.

^13 In [43] it is proven unconditionally that $\limsup_{D\to-\infty} \frac{L'(1,\chi_D)}{L(1,\chi_D)\log\log|D|} \ge 1/2$ and $\liminf_{D\to-\infty} \frac{L'(1,\chi_D)}{L(1,\chi_D)\log\log|D|} \le -1/2$.
This tends to show that these quantities are indeed delicate.

^14 And it is very difficult to improve it unconditionally. As explained for instance in [43, Theorem 4.2] such
an improvement would be intimately related with the absence of Siegel zeros.
1.9. Notation and convention. For notational simplicity we shall prove the estimate with
$\eta = \eta'$. We shall label the successive exponents arising in the sequel in the following manner:

(1.18) $0 < \eta_5 < \eta < \eta_4 < \eta_3 < \eta_2 < \eta_1.$

The exponent $\eta_1$ arises in Theorem A. Then $\eta_2$ will be chosen sufficiently small compared to
$\eta_1$ and so on. The exponent $\eta$ is the one from Theorem 1. We did not compute its precise
value. The exponent $\eta_5$ appears in the proof of Proposition 4.1 and thus in Theorem 2. It is
more customary in analytic number theory to keep these choices implicit, but we believe this
labelling improves the clarity.

For the height functions, we adopt the conventions from [4]. If $L$ is an ample divisor, $h_L$
denotes the composition of the naive height with the map to projective space induced by $L$.
If the underlying variety is abelian, $\hat{h}_L$ denotes the canonical height. The symbols $O()$, $o()$, $\sim$, $\ll$ and
$\asymp$ have their traditional meaning.

1.10. Structure of the paper. The proof of the main Theorem 1 is performed in § 7.
§ 5 contains the estimate on sums of exponential sums while § 6 builds the variant of the
circle method.

Th. A --> § 5 --> § 7 --> § 4 --> § 3
                   ^               :
                   |               :
                  § 6             § 2

The proof of Theorem 2 is performed in § 4. The application to the height of Heegner points is
presented in § 3. This is to be compared with a geometric approach in § 2.

1.11. Acknowledgments. The article is partly based on Chapter 12 of the author's PhD
thesis [65] and some of the results have been announced in [62]. My indebtedness goes to
my advisor Philippe Michel for his constant support. I thank Peter Sarnak for insisting on
developing the approximate functional equation method for the present family: at that time
(May 2007) it was not clear that an estimate like (1.2) would exist. I also want to express my
gratitude to Philippe Michel and Akshay Venkatesh for letting me work on these problems
although they already had distinct interesting ideas (see [41], [40, § 2.4]). I thank Gergely
Harcos for introducing me to some of the subtleties of the Shifted Convolution Problem. My
final thanks go to the book [4].

2. Heights of Heegner points - geometric approach. 

Before proceeding in detail with the proofs of Theorems 1 and 2, we discuss a geometric
proof of a weak version of Theorem 2. The techniques of this section are of a very different
flavor than the rest of the text and the reader interested solely in L-functions may skip this
section. We believe this section will be useful for the reader to gain a better understanding
of the objects underlying the moments of quadratic L-functions.

Let $E$ be a rational elliptic curve. Let $N$ be its conductor and $\varphi : X_0(N) \to E$ be a Weil
parametrization, which exists by Wiles' celebrated theorem. Let $h : E(\overline{\mathbb{Q}}) \to \mathbb{R}_+$ be the Néron-Tate
height. Let $D$ be a fundamental negative discriminant such that the Heegner condition
is satisfied: all prime factors of $N$ are split in $\mathbb{Q}(\sqrt{D})$. We choose one Heegner point $z_D$ of
discriminant $D$ on $X_0(N)$.
The quantity $\frac{h(\varphi(z_D))}{\deg(\varphi)}$ is an arithmetic invariant of the couple $(E, D)$ formed by an elliptic
curve $E/\mathbb{Q}$ and a compatible discriminant $D$. Actually it depends only on the isogeny class
of $E$. We are interested in its behavior as $D$ gets large.

In [63], we established $\liminf_{D\to-\infty} h(\varphi(z_D)) > 0$ by an equidistribution argument that works in
a fairly general situation. We observed also [63, §4] that in the present case of modular curves
$X_0(N)$, it is possible to use the geometry of the cusps via rough comparison arguments and
established:

(2.1) $h(\varphi(z_D)) \gg_E \mathcal{L}_D.$

In the next proposition we shall refine this last result. The proof of the proposition occupies
§ 2.1 to § 2.6. We may view the present section as a complement of section 4 from [63]. Let
$g(N)$ be the genus of $X_0(N)$ and $\nu(N) := [SL_2(\mathbb{Z}) : \Gamma_0(N)]$.

Proposition 2.1. Let notations and assumptions be as above. Then:

(2.2) $\frac{h(\varphi(z_D))}{\deg\varphi} \sim \frac{6}{\nu(N)}\,\mathcal{L}_D$, as $D \to -\infty$.

Remark 14. It seems difficult to have a good control on the quality of the asymptotic (2.2)
from the geometric approach. This mainly comes from Proposition 2.5 which does not
give an explicit error term but merely the existence of a limit. Also Faltings' approximation
result contains a $\log\log|D|$ in the remainder term which is difficult to remove.

During the proof we shall establish^15:

Lemma 2.1. Let $h : X_0(N)(\overline{\mathbb{Q}}) \to \mathbb{R}_+$ be as in [20], see also § 3.5.

(2.3) h{zD) ~ ^^^^D, asD^ -oo. 

Remark 15. One may decompose $h$ on $J_0(N)$ as a sum of the Néron-Tate heights on its
simple abelian quotients. From this fact one may deduce Lemma 2.1 from the analog of
Proposition 2.1 for modular abelian varieties (which is also proved in the next section 3 by
analytic methods).

However this decomposition itself is useless in the proof of Proposition 2.1. For instance a
divisor on $J_0(N)$ may project to zero or to a torsion point on $E$. In the argument below we use
the fact that the Heegner points really belong to the curve $X_0(N)$ inside $J_0(N)$. Precisely, we
make use of Proposition 2.5 which automatically removes this possibility (at least for points
of large height).

Remark 16. The arguments provided below may be compared with section 4 from [63] in a
fairly precise way. Although stated in a different language, both proofs are of the same flavor.
§ 1.8 discusses [63, Lemma 5]. Proposition 2.2 below covers [63, Lemma 6].
Proposition 2.3 covers [63, inequality (24)]. Propositions 2.4 and 2.5 cover [63, inequalities
(25-28)].

^15 An explicit formula for Heegner points on Shimura curves of full level is the main purpose of [34].
2.1. A formula by Chowla and Selberg. In the early 1980s and 1990s, several articles were
written on the periods of CM elliptic curves (and more generally of CM abelian varieties). In
this paragraph we briefly recall the formula we shall need.

Recall that there is a notion of Faltings height of an abelian variety defined over $\overline{\mathbb{Q}}$, see
e.g. [16]. Let $E_D$ be an elliptic curve over $\overline{\mathbb{Q}}$ with CM by $\mathcal{O}_D$. We refer the reader to the
book [21] for a discussion of the arithmetic properties of these curves. The connection to
periods of CM elliptic curves (and abelian varieties) was first observed by P. Deligne.

Proposition 2.2 (Chowla-Selberg). The Faltings height of $E_D$ depends only on $D$ and is
equal to:

(2.4) $2h_{\mathrm{Fal}}(E_D) = \mathcal{L}_D + c,$

where $c$ is an absolute constant^16.

A proof is to combine the Kronecker limit formula for Eisenstein series on $SL_2(\mathbb{Z})\backslash\mathfrak{H}$ and the
Hecke period formula. The reader is referred to [8] or [34, Proposition 10.10] for further
discussions around that formula.

2.2. Approximation of the Faltings height. In his proof of finiteness theorems for abelian
varieties, Faltings [16, §3] shows that, up to logarithmic terms, the Faltings height is a multiple
of the height of the abelian variety on the moduli space (with respect to an embedding to
projective space which is defined in a canonical way).^17

For elliptic curves, one may find a nearby discussion of this fact in [60, Proposition 2.1].
Note that our definition of the Faltings height differs from [60].

Proposition 2.3. Let $E$ be a semistable elliptic curve defined over $\overline{\mathbb{Q}}$ of j-invariant $j_E$. Then
(the constants are absolute):

(2.5) $O(1) \le h(j_E) - 12\,h_{\mathrm{Fal}}(E) \le 6\log(1 + h(j_E)) + O(1).$

2.3. Asymptotic height of singular moduli. The j-invariant of $E_D$, which we have denoted
$j_D$, is unique up to Galois conjugation. From Propositions 2.2 and 2.3 and § 1.8 we deduce:

Proposition 2.4. The naive height of $j_D$ satisfies the following asymptotic:

(2.6) $h(j_D) \sim 6\,\mathcal{L}_D$, as $D \to -\infty$.

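2.4. Image of points of large height. Let us recall the following, see [25, Proposition B.3.5]:

Proposition 2.5. Let $X$ be a smooth projective curve defined over $\overline{\mathbb{Q}}$. Let $A, B$ be divisors
on $X$ with $\deg(A) \ge 1$. Then:

(2.7) $\lim_{P \in X(\overline{\mathbb{Q}}),\ h_A(P) \to \infty} \frac{h_B(P)}{h_A(P)} = \frac{\deg B}{\deg A}$.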



Footnote (on the constant $c$ in (2.4)): we do not display its exact value because it depends on the
chosen normalization of $h_{\mathrm{Fal}}$, which varies from one article to another.

Footnote (on the canonical embedding in § 2.2): this construction is better viewed in the language of
metrized line bundles for Arakelov geometry.
2.5. Explicit degree of certain divisors. Consider the map $\iota : X_0(N) \to J_0(N)$ from
the modular curve to its Jacobian which sends the cusp $\infty$ to the origin. Denote by
$\pi : X_0(N) \to X(1) \simeq \mathbb{P}^1$ the standard projection, which is of degree $\nu(N) = [SL_2(\mathbb{Z}) : \Gamma_0(N)]$.
All the morphisms in the following diagram are defined over $\mathbb{Q}$, which allows us to consider the
images of the Heegner points $z_D$.

(2.8) $\mathbb{P}^1 \xleftarrow{\ \pi\ } X_0(N) \xrightarrow{\ \iota\ } J_0(N)$, together with $\varphi : X_0(N) \to E$.



We shall need the precise value of the degree of certain divisors on Xo{N) that are pullbacks 
by the above maps. 

Lemma 2.2. Let $\Theta$ be the theta divisor on $J_0(N)$ and $\Xi := \Theta + [-1]^*\Theta$; let $\mathcal{O} := \mathcal{O}(O)$ be
the line bundle associated to the origin $O$ of $E$. Then:

(i) $\deg(\varphi^*\mathcal{O}) = \deg\varphi$;

(ii) $\deg\iota^*\Xi = 2g(N)$ (the pullback is in the sense of line bundles or divisor classes);

(iii) $\deg\pi^*\mathcal{O}(1) = \deg\pi = \nu(N)$.

Proof. (i) and (iii) are obvious. Assertion (ii) is classical, see section 8.10 from [4] for instance.

□ 
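The degree $\nu(N)$ in (iii) can be evaluated with the classical index formula $\nu(N) = N\prod_{p \mid N}(1 + 1/p)$, a standard fact that is not restated in the text. The following minimal Python sketch computes it; the function name `gamma0_index` is ours and purely illustrative.

```python
def gamma0_index(N: int) -> int:
    """Index [SL_2(Z) : Gamma_0(N)] = N * prod_{p | N} (1 + 1/p)."""
    if N < 1:
        raise ValueError("N must be a positive integer")
    index, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            # multiply by (1 + 1/p) while staying in the integers
            index = index // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # one prime factor larger than sqrt(N) may remain
        index = index // n * (n + 1)
    return index
```

For a prime level $p$ the index is simply $p + 1$, so the conductor 37 curve mentioned in Remark 17 corresponds to `gamma0_index(37)`.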



2.6. Proof of Proposition 2.1. Recall that $z_D \in X_0(N)(\overline{\mathbb{Q}})$ and that the point $\pi(z_D)$
corresponds to an elliptic curve with CM by $\mathcal{O}_D$, hence is conjugate to $j_D$.

We begin with the asymptotic of $h(z_D)$, where we recall that the height $h$ on $X_0(N)$ is such
that $h = \hat{h}_{\Xi} \circ \iota$. By the formalism of height functions (see, e.g., [25, B.3.2]), one has:

(2.9) $h(z_D) = \hat{h}_{\Xi} \circ \iota(z_D) = h_{\Xi} \circ \iota(z_D) + O(1) = h_{\iota^*\Xi}(z_D) + O(1)$,
and:

(2.10) $h(j_D) = h \circ \pi(z_D) = h_{\pi^*\mathcal{O}(1)}(z_D) + O(1)$.

From Proposition 2.4 and the lower bounds from § 1.8 we deduce $h_{\pi^*\mathcal{O}(1)}(z_D) \to \infty$. Since the
degree of $\pi^*\mathcal{O}(1)$ is positive, we may apply Proposition 2.5 which yields:

(2.11) $h(z_D) \sim \frac{\deg \iota^*\Xi}{\deg \pi^*\mathcal{O}(1)}\, h(j_D)$, as $D \to -\infty$.

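By definition of the Néron-Tate height we have $\hat{h} = h_{O} + O(1)$ on $E(\overline{\mathbb{Q}})$. Hence:

(2.12) $\hat{h}(\varphi(z_D)) = h_{\varphi^*\mathcal{O}}(z_D) + O(1)$.

Since the degree of $\iota^*\Xi$ is positive, $h_{\iota^*\Xi}(z_D) \to \infty$ by the previous result. We may again
apply Proposition 2.5 which yields by (2.11):

(2.13) $\hat{h}(\varphi(z_D)) \sim \frac{\deg \varphi^*\mathcal{O}}{\deg \pi^*\mathcal{O}(1)}\, h(j_D)$.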

Making use of Lemma 2.2 and Proposition 2.4, we conclude the proof of Proposition 2.1.






3. Heights of Heegner points - analytic approach. 

By the Gross-Zagier formula, the quantity $L'(1/2, f \times \chi)$ is proportional to the Néron-Tate
height of $\varphi(z_D)$ introduced in the previous section. Theorem 2 then yields precise information
about these heights. Although it is possible to carry out the study in greater generality, we
stick to the initial Gross-Zagier context [20] which we now proceed to recall.

3.1. The Gross-Zagier formula. Assume that the Fourier coefficients of $f$ are rational and
let $E$ be the rational elliptic curve associated to $f$ by the Shimura-Taniyama construction and
$\varphi : X_0(N) \to E$. As in the previous section we assume that the Heegner condition is satisfied
which implies $\chi_D(N) = 1$, thus an odd functional equation for $L(s, f \times \chi)$.

The Gross-Zagier formula [20, §1.6] yields: 

(3.1) ^5:r(l/2./x,)=„L,,,„)ta). 

By combining the formulas given in [20, pp. 230, 308, 310], one has the following^^ (see 
also [50] or [49, Remarque 5]): 

2_/v 

(3.2) a = — L(l,SymV). 

3.2. Refined asymptotic for the height. Now we may explain the arithmetic significance
of the moment in Case (D) (cf. the introduction § 1.2 and Theorem 2):

Corollary 3.1. Let assumptions be as above and assume D is prime. Then: 



CD + hf + Of{\D 



as D ^ — oo. 



^^^^ p\N ^ 

where: 

(3.4) hf := ^(1, Sym2 /) - ^(2) - 7 - log 27r + ^ log + J] 

This result follows from (3.1), (3.2), Theorem 2 and the fact that all prime factors of $N$ are
split in $\mathbb{Q}(\sqrt{D})$. It is consistent, except for a multiplicative constant, with Proposition 2.1.

Remark 17. This asymptotic improves on a recent result by G. Ricotta and T. Vidick [50, 
Theorem 4.1]. Their result concerns the average of ^^gg^^) over Y < D < 2Y, with Y 00. 
The leading term is of the form (see also [49]): 

(3.5) logY + h'f + 0{Y-^). 

If we average (3.3) we indeed recover that result because the average of $\mathcal{L}_D$ is proportional to
$\log Y$ (one may also check that the average of $h_f$ agrees with $h'_f$).

More precisely our result uncovers the apparent complexity of [50, Figure 1] which plots
the values of (3.3) with $E$ an elliptic curve of conductor 37 and $|D|$ going up to 5.10^ The
general trend is a logarithmic growth (which is consistent with the bound $\mathcal{L}_D \gg \log|D|$) but,
as the authors pointed out, the growth seems to be "very irregular". We may now explain
this phenomenon by the fact that $\mathcal{L}_D - \frac{1}{2}\log|D| = \frac{L'}{L}(1,\chi_D)$ may take exceptionally large
values (positive or negative), especially when the class number $h(D)$ is exceptionally small,
which may happen in that range of discriminant. See [43, Figure 1] for a plot of $\frac{L'}{L}(1,\chi_D)$.

Footnote (on (3.2)): In § 3.5 we give further details on this equality.

Footnote (on the multiplicative constant): there is a discrepancy by a factor 2 between the two results.
The author has tried for a long time to settle the exact value of the constant. It is really difficult to do
so in view of the number of distinct manipulations involved to establish Proposition 2.1 and Corollary 3.1
and the Gross-Zagier formula [20]. Perhaps the 12 should be 6? We couldn't decide whether the mistake
arises in the present article or in one of the formulas we quote from the literature.
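The link between a small class number and a small value of $L(1,\chi_D)$ can be made concrete with Dirichlet's class number formula $L(1,\chi_D) = \pi h(D)/\sqrt{|D|}$, valid for fundamental $D < -4$ (a standard fact, not taken from the text). The sketch below counts reduced primitive binary quadratic forms; the function names are ours, and it only illustrates $L(1,\chi_D)$, not the logarithmic derivative plotted in [43].

```python
from math import gcd, pi, sqrt

def class_number(D: int) -> int:
    """Number of reduced primitive forms a x^2 + b x y + c y^2 of discriminant D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    h, a = 0, 1
    while 3 * a * a <= -D:                 # reduced forms satisfy a <= sqrt(|D|/3)
        for b in range(-a + 1, a + 1):     # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue                   # not reduced
            if gcd(gcd(a, b), c) == 1:
                h += 1
        a += 1
    return h

def L1_chi(D: int) -> float:
    """L(1, chi_D) via the class number formula (fundamental D < -4 assumed)."""
    return pi * class_number(D) / sqrt(-D)
```

For instance `class_number(-163)` is 1, which makes `L1_chi(-163)` much smaller than `L1_chi(-23)`.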

3.3. A challenging remark... If one inspects the geometric approach of the previous section
one may see that it is possible to prove:

(3.6) $\frac{1}{h(D)} \sum_{\chi} L'(1/2, f \times \chi) \ge C_f\, L(1,\chi_D) \log|D|$, for $|D|$ large enough,

without making use of any deep analytic estimate for quadratic L-series. Here $C_f > 0$
depends only on $f$. Indeed we first make use of the Gross-Zagier formula (3.1), then the
ingredients involved in the proof of Proposition 2.1 from section 2 consist of generalities on
height functions plus the Chowla-Selberg formula. As recalled in § 1.8 the bound $\mathcal{L}_D \gg \log|D|$
follows from Weyl's law on the zeros of $L(s,\chi_D)$.



3.4. ...and a reservation. However if we compare the situation to other L-functions asso- 
ciated to quadratic fields (Cases (A-D) discussed in the introduction), it is possible to make 
the previous observation slightly less surprising. 

In Case (C), the Waldspurger formula combined with the fact that cusp forms are bounded
yields at once an $O(1)$ bound for the corresponding moment. But it is a consequence of Duke's
equidistribution theorem that the moment has a positive limit as $D \to -\infty$ [41].

In Case (A), a similar discussion occurs in [13] which is even closer to our situation. The
authors explain that the proof of [13, Theorem 2] is made "using mostly elementary means"
and still provides an asymptotic for the second moment - this is to be compared with
Proposition 2.1. On the other hand the proof of [13, Theorem 3] demands "a lot more work" and
the use of Duke's theorem - this is to be compared with Corollary 3.1.



3.5. Appendix: on multiplicative constants. The determination of the value of $\alpha$ is
quite puzzling since the normalizations in [20] are not always standard and are scattered
through the text. Its exact value is important for us to check the consistency between sections 2
and 3. In this paragraph we give some details. We hope this will be helpful to gain a better
understanding of the underlying quantities.
Consider the diagram: 

(3.7) Xo(iV)^^ MN) ^ 




E 



The genuine Gross-Zagier formula, as it is proved in [20, Theorem 6.3 § 1.6] or [72, Theorem 
1.2.1] or [70] is the identity: 

(3.8) ^L'(l/2, / X x) = 16^(/, /)L(1, xd)%{zt,) J, 



Footnote (on the constant $C_f$ in (3.6)): it is effective but the "for $D$ large enough" is not.

where the right-hand side refers to the $(f,\chi)$-isotypical component. It is possible to infer

the equahty M^iiH = h{i{z)f ), see [20, p. 310]. From the relation (/, /) = g^L(l,Sym2/) 
aeg (p 

one deduces the value of a given in (3.2). 

4. Proof of Theorem 2. 

4.1. Rankin-Selberg L-functions. The assumptions are as in Theorem 2. From Rankin-Selberg
theory we have a convolution representation of the L-function (see [20, Chap. IV
(0.2)] for a proof of the following properties):

(4.1) $L(s, f \times \chi) = L^{(N)}(2s,\chi_D) \sum_{\mathfrak{a} \subset \mathcal{O}_D} \chi(\mathfrak{a})\lambda_f(N\mathfrak{a})\,N\mathfrak{a}^{-s} =: \sum_{n=1}^{\infty} \frac{a_n}{n^s}$, say.



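The sum is over ideals of the ring of integers $\mathcal{O}_D$ of $\mathbb{Q}(\sqrt{D})$. We have a holomorphic
continuation and if we set:

(4.2) $\Lambda(s, f \times \chi) := |ND|^{s}\, \Gamma_{\mathbb{R}}(s + \tfrac{1}{2})^2\, \Gamma_{\mathbb{R}}(s + \tfrac{3}{2})^2\, L(s, f \times \chi)$,

where $\Gamma_{\mathbb{R}}(s) := \pi^{-s/2}\Gamma(\tfrac{s}{2})$, the functional equation reads:

(4.3) $\Lambda(1-s, f \times \chi) = -\Lambda(s, f \times \chi)$, $\forall s \in \mathbb{C}$.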

4.2. Approximate functional equation. Recall that the Dirichlet L-series associated to
principal automorphic representations are absolutely convergent for $\Re s > 1$ and the
functional equation (4.3) links this region to $\Re s < 0$. The values lying in the "critical strip" $0 \le \Re s \le 1$
carry the deepest arithmetic information on the coefficients $(a_n)_{n \ge 1}$. In (4.8), $L'(1/2, f \times \chi)$ is
expressed as a weighted sum of the first $|ND|$ coefficients, the so-called "approximate functional
equation" method. This procedure is classical and we shall recall briefly what we need here,
referring to [22] or [29, §5.2] for details.

Set $L_\infty(s) := \Gamma_{\mathbb{R}}(s + \tfrac{1}{2})^2\, \Gamma_{\mathbb{R}}(s + \tfrac{3}{2})^2$, so in particular $L_\infty(\tfrac{1}{2}) = \pi^{-2}$. Let us choose once and
for all a meromorphic function $G$ such that:

• $G$ is holomorphic on $\mathbb{C}$ except at 0, where we have:

(4.4) $G(s) = \frac{1}{s^2} + O(1)$, $s \to 0$,

• $G$ is even: $G(s) = G(-s)$ $\forall s \ne 0$,

• $G$ is of moderate growth (polynomial) on vertical lines.

(Actually one may simply choose $G(s) := \frac{1}{s^2}$ but there is no harm in retaining this degree
of generality: mainly (4.7) is needed in the sequel). Let $V$ be defined by:

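(4.5) $V(y) := \int_{\Re s = 2} F(s)\, y^{-s}\, \frac{ds}{2\pi i}$, $y \in (0,\infty)$,

where

(4.6) $F(s) := \pi^{2}\, L_\infty(\tfrac{1}{2} + s)\, G(s)$, $s \in \mathbb{C} - \{0\}$.
It is not difficult to check that:

(4.7) $F(s) = \frac{1}{s^2} - \frac{2(\gamma + \log 2\pi)}{s} + O(1)$, $s \to 0$.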
A standard contour argument shows, as a consequence of (4.3), that:

(4.8) $L'(1/2, f \times \chi) = 2 \sum_{n=1}^{\infty} \frac{a_n}{\sqrt{n}}\, V\!\left(\frac{n}{|ND|}\right)$.

The sum is rapidly convergent and more precisely we have the following estimates.
Lemma 4.1. For every integer j G N i«e have: 



(4.9) y(^)(y) 



' - (log 2/) + Oj (y ) when < y < 1 , 
OA,j{y~^) when I ^y, for all A> 0. 



Proof. When < y ^ 1, move the line of integration in (4.5) to JRes = — ^, crossing a pole 
at s = of residue — log y and estimate the remaining integral with Stirling formula. When 
y ^ 1, move the line of integration in (4.5) to SRes = A. See also [29, Proposition 5.4]. □ 

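From (4.1), the identity (4.8) and by the orthogonality of characters on a finite abelian group
we deduce the following formula that will be our starting point for the proof of Theorem 2:

(4.10) $\frac{1}{h(D)} \sum_{\chi} L'(1/2, f \times \chi) = 2 \sum_{m \ge 1,\ (m,N)=1}\ \sum_{n=1}^{\infty} \frac{\chi_D(m)}{m}\, \frac{\lambda_f(n)\, r_D(n)}{\sqrt{n}}\, V\!\left(\frac{m^2 n}{|ND|}\right)$.

Here $r_D(n)$ is the number of elements of $\mathcal{O}_D$ of norm $n$ (counted up to sign), that is:
(4.11) $2r_D(n) := \#\{(a,b) \in \mathbb{Z}^2 : a^2 - b^2 D = 4n\}$

(we have assumed $|D| \ge 7$ odd).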
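The counting function in (4.11) is easy to tabulate by brute force, which helps when following the separation between the $b = 0$ terms and the remaining ones. A minimal Python sketch, assuming our reading of (4.11) with $D$ odd and negative (the name `r_D` mirrors the notation but the implementation is ours):

```python
from math import isqrt

def r_D(D: int, n: int) -> int:
    """Half the number of (a, b) in Z^2 with a^2 - b^2 * D = 4 * n, for odd D < 0."""
    if D >= 0 or n < 1:
        raise ValueError("expected D < 0 and n >= 1")
    count = 0
    b = 0
    while -D * b * b <= 4 * n:
        rest = 4 * n + D * b * b            # candidate value of a^2
        a = isqrt(rest)
        if a * a == rest:
            signs_a = 1 if a == 0 else 2    # a and -a
            signs_b = 1 if b == 0 else 2    # b and -b
            count += signs_a * signs_b
        b += 1
    return count // 2
```

When $4n < |D|$ only $b = 0$ can occur, so `r_D(D, n)` is 1 on perfect squares and 0 otherwise, in line with the discussion of the main term.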

4.3. Main term. Since $D$ appears several times in identity (4.10) it is not clear a priori
what the main term is as $D \to -\infty$. The aim of this paragraph is to give some explanations
of how to identify where it comes from.

Because of the weight $\frac{1}{m}$ the $m$-sum diverges gently enough (logarithmic growth) so that
the sign $\chi_D(m)$ cannot really matter. The $n$-sum is really the key.

Let us consider the terms that correspond to $b = 0$ in the "counting function" $r_D$. We
shall show that these terms contribute a positive amount to the asymptotic (and it turns out
that this will indeed constitute the main term of Theorem 2).

To have a feeling of this, one may view $r_D(n)$ as a "probability density function" against
which we sum the eigenvalues $\lambda_f$. When $1 \le 4n < |D|$, the density is located on the perfect
squares, each of the same weight. Observe that this contribution comes from the $b = 0$ terms
only. This set is fixed and captures small Hecke eigenvalues of $f$ so that it cannot cancel out
(and has to contribute to the main term of the final asymptotic).

When $|D| \le 4n < |ND|^{1+\epsilon}$ one could say that the density is less sparse and we shall see
in the next paragraph that when summing $\lambda_f$ against it, we indeed obtain cancellations (also
observe that the weight $\frac{1}{\sqrt{n}}$ diminishes the individual values of the summand).



Footnote (on the main term): see also [49,50] for a related discussion where the average over $D$ simplifies the situation.

Footnote (on the terms $b = 0$): one could view these as diagonal terms by analogy with classical situations.

Footnote (on the density being less sparse): this picture is not entirely truthful since we shall apply
Theorem 1 which exhibits cancellations against the sparse sequence $n \mapsto n^2 + d$.

Lemma 4.2. The contribution in (4.10) from the terms $b = 0$ is asymptotic to
(4.12) 



4 ^(^'Sym /) 



1 L'(^) 
-log|Z)iV| + -^(l,XD)+ 



+ -(1, Sym2 /) - i^(2) - 7 - log 2vr + 0^(|DrV32+^) 



-(l,Sym2/)-^ 
Proof. The contribution is equal to

(4.13) $2 \sum_{m \ge 1,\ (m,N)=1}\ \sum_{a=1}^{\infty} \frac{\chi_D(m)}{m}\, \frac{\lambda_f(a^2)}{a}\, V\!\left(\frac{m^2 a^2}{|ND|}\right)$.

Recall that ($N$ is squarefree):

(4.14) $L(s, \mathrm{Sym}^2 f) = \zeta^{(N)}(2s) \sum_{n=1}^{\infty} \frac{\lambda_f(n^2)}{n^s}$, for $\Re s > 1$.

From (4.5) we deduce that the contribution is also equal to:

(4.15) $2 \int_{\Re s = 2} \frac{L^{(N)}(2s+1, \chi_D)}{\zeta^{(N)}(4s+2)}\, L(2s+1, \mathrm{Sym}^2 f)\, F(s)\, |ND|^{s}\, \frac{ds}{2\pi i}$.

We move the line of integration to $\Re s = -\tfrac{1}{4}$, crossing a pole at $s = 0$. The residue is as
given in (4.12), as one may check from (4.7). The remaining integral is bounded thanks to
the rapid decay of $F(s)$ as $\Im s \to \pm\infty$ and the Burgess subconvexity bound. □

4.4. Remaining terms. In view of the discussion in the previous paragraph, it is natural to
introduce the function:

(4.16) $r'_D(n) := \#\{(a,b) \in \mathbb{Z} \times \mathbb{N}_{\ge 1} : a^2 - b^2 D = 4n\}$.

From (4.10), Lemma 4.2, and the forthcoming estimates it is easy to complete the proof of
Theorem 2. Observe that since $N$ is fixed the two ranges for the $m$ parameter in (i) and (ii)
of the following proposition overlap to a large extent when $D \to -\infty$.

Proposition 4.1. We have the following uniform bounds, 
(i) When 1 ^ m ^ 

n=l ' ' 



(a) When 2y/ N ^ m < oo: 
(4.18) i;^%^^n^)«/AeA^^|I^I^/^---^^ forallA.e>,. 



n=l 



Proof. (ii) When $2\sqrt{N} \le m < \infty$, the estimate comes from the rapid decay of $V$. Indeed
$r'_D(n) > 0$ only if $n \ge |D|/4$, in which case $\frac{m^2 n}{|ND|} \ge 1$. Therefore we may apply the
second estimate in Lemma 4.1.

(i) Assume now that 1 ^ m ^ and write 4n = — b^D. When b ^ 2\/iV, we have 

2 

again ^ 1 and we apply Lemma 4.1 as before. This yields a negligible contribution as 
soon as 6 ^ IL'I''^. 
From now on we assume that 6 ^ |-D|^^. In a similar manner, we may assume up to a
negligible term that \a\ ^ \D\^^'^^^^ . The contribution from a = is clearly negligible and 
thus it remains to estimate: 

where W{y) := V{y)y^^^'^ . Introduce: 

(4.20) := Xfia^ - b'^D), x e M+, 

SO that after integrating by parts we need to estimate: 



m 



3 r\D\ 



(4-21) — / S^W'i 



^1 AaP' — b'^D)m? xdx 



\ND\ '\ND\ 

We have y := (^'y^'^)"^' ^ so that W'{y) is bounded by 0(iV3/2) ^ q^^;^)^ 

We make use of Theorem 3 to bound Sx- A straightforward dyadic subdivision yields: 

(4.22) Sx «/ IL*!'/'"" b^ log |Z)| , Vx < . 
Inserting this bound in (4.21) yields 

(4.23) iZ^r^ iD^s 

which concludes the proof of the proposition. □ 

5. On quadratic exponential sums. 

As we shall see in the context of the proof of Theorem 1, the following exponential sum
arises naturally when one applies the $\delta$-symbol method (see section 6 and identity (7.4)):

(5.1) $\frac{1}{q} \sum_{n\ (\mathrm{mod}\ q)} S(m, n^2 + d; q)\, e_q(ln)$.

This sum carries a square-root cancellation in the sense that its typical size is $\tau(q)$ (as $q$
gets large). As explained in the introduction, this cancellation is not enough for our purpose
and we shall need quantitative oscillations of the "angle" (argument) as $q$ varies. In more
concrete terms this means cancellations when summing over $q$ in an interval.

In this section we shall state an estimate which is what we need to prove the main Theorem,
see bound (5.7) in Theorem A. Ultimately the estimate would rely on Iwaniec's celebrated
estimate for Fourier coefficients of half-integral weight forms [30]. We have decided not to include
the proof of Theorem A here because it is tedious and requires the introduction of a large
number of objects. For these reasons and for the sake of clarity we postpone the complete
discussion and proof to the companion paper [64].

Remark 18. It took a long time for the author to study and uncover the properties of
the exponential sum (5.1). In the following we present the quickest way to deal with it by
recognizing a link with Jacobi forms. In the author's PhD thesis [65] we have established (5.7)
under certain coprimality assumptions which would be enough for the proof of Theorems 1
and 3, see [62] for an outline of a possible method via explicit evaluation of twisted Salié sums
and the equidistribution of roots of quadratic congruences [13,27].



Footnote (on postponing the proof): We apologize to the reader if as a consequence the content of
this section might appear a little mysterious at first sight.
5.1. A family of exponential sums. The following exponential sums appear in the Fourier
expansion of Poincaré series for Jacobi forms, see [15, part 1] (Eisenstein series) or [18, §11.2]
(general case).

Definition 5.1. For $q \ge 1$ and $n_1, n_2, r_1, r_2 \in \mathbb{Z}$, let:

(5.2) $J(n_1,r_1;n_2,r_2;q) := \frac{1}{q}\, e_{2q}(r_1 r_2) \sum_{x \in (\mathbb{Z}/q\mathbb{Z})^{\times},\ y \in \mathbb{Z}/q\mathbb{Z}} e_q\big((y^2 + r_1 y + n_1)\bar{x} + n_2 x + r_2 y\big)$.

It is clear that we have the identity:

(5.3) $\frac{1}{q} \sum_{n\ (\mathrm{mod}\ q)} S(m, n^2 + d; q)\, e_q(ln) = J(d,0;m,l;q)$.

These exponential sums enjoy many properties, for instance the symmetry between the
indices 1 <-> 2. Here we recall the twisted multiplicativity property which is a straightforward
consequence of the Chinese remainder theorem. For $q, q' \ge 1$ with $(q,q') = 1$, one has:

(5.4) $J(n_1,r_1;n_2,r_2;qq') = J(n_1\overline{q'}^{\,2}, r_1\overline{q'}; n_2, r_2; q)\, J(n_1\bar{q}^{\,2}, r_1\bar{q}; n_2, r_2; q')$.
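The identities (5.3) and (5.4) lend themselves to a direct numerical check. The sketch below implements $J$ by brute force under our reading of the normalization in Definition 5.1 (prefactor $\frac{1}{q}e_{2q}(r_1 r_2)$, $x$ over units and $y$ over all residues mod $q$); the helper names `e` and `kloosterman` are ours. To avoid any ambiguity in the factor $e_{2q}(r_1 r_2)$, the multiplicativity check is meant for $r_1 = 0$, which is the case used later in the text.

```python
import cmath
from math import gcd

def e(num: int, q: int) -> complex:
    """Additive character e_q(num) = exp(2*pi*i*num/q)."""
    return cmath.exp(2j * cmath.pi * (num % q) / q)

def kloosterman(m: int, n: int, q: int) -> complex:
    """Classical Kloosterman sum S(m, n; q)."""
    return sum(e(m * x + n * pow(x, -1, q), q)
               for x in range(1, q + 1) if gcd(x, q) == 1)

def J(n1: int, r1: int, n2: int, r2: int, q: int) -> complex:
    """Brute-force J(n1, r1; n2, r2; q), assuming the normalization described above."""
    total = 0
    for x in range(1, q + 1):
        if gcd(x, q) != 1:
            continue
        xbar = pow(x, -1, q)               # inverse of x modulo q (Python >= 3.8)
        for y in range(q):
            total += e((y * y + r1 * y + n1) * xbar + n2 * x + r2 * y, q)
    return e(r1 * r2, 2 * q) * total / q

def kloosterman_side(d: int, m: int, l: int, q: int) -> complex:
    """Left-hand side of (5.3): (1/q) * sum_n S(m, n^2 + d; q) e_q(l n)."""
    return sum(kloosterman(m, n * n + d, q) * e(l * n, q) for n in range(q)) / q
```

With coprime moduli such as 5 and 7 one can then compare `J(n1, 0, n2, r2, 35)` with the product of the two twisted factors of (5.4), using `pow(7, -1, 5)` and `pow(5, -1, 7)` for the inverses.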

5.2. Sums of exponential sums. This section contains the technical estimate that we shall
need in the proof of Theorem 1.

Theorem A. Let $n_1, n_2, r_1, r_2 \in \mathbb{Z}$ be such that $r_1^2 - 4n_1$ or $r_2^2 - 4n_2$ is non-zero. Put:

(5.5) $C := (|r_1^2 - 4n_1| + 1)(|r_2^2 - 4n_2| + 1)$.
(i) For all $\epsilon > 0$,

(5.6) $J(n_1,r_1,n_2,r_2;q) \ll_\epsilon (qC)^{\epsilon}$.
(ii) For $a \ge 1$ one has the following uniform estimate:

(5.7) $\sum_{Q < q < 2Q,\ q \equiv 0\ (\mathrm{mod}\ a)} J(n_1,r_1,n_2,r_2;q) \ll Q^{1-\eta_1} a^{A}$,

valid for all $Q$ with $1 \le Q < C^{1/2+\eta_1}$. Here $\eta_1, A > 0$ are absolute constants.
A proof of these estimates is the main object of [64].

5.3. A reduction. The following lemma will allow the use of Theorem A in the presence of a
residual inverse $\overline{\mathcal{N}_2}$ such that $\mathcal{N}_2\overline{\mathcal{N}_2} \equiv 1 \pmod{q}$. This occurrence will appear in the sequel,
see equation (7.4).

Lemma 5.1. Let $D$ be a fundamental discriminant, $l, m \in \mathbb{Z}$ and $e \ge 1$ be integers. Let
$\mathcal{N}_2 \ge 1$ be odd squarefree and coprime with $D$ and $q$. Introduce $\mathcal{N}_2 = \mathcal{N}_3\mathcal{N}_4$ with $\mathcal{N}_4 \mid e$ and
$(\mathcal{N}_3, e) = 1$ and put $e' := e/\mathcal{N}_4$. We have the equality:

(5.8) $J(-e^2 D, 0, m\overline{\mathcal{N}_2}, l; q) = \chi_D(\mathcal{N}_3)\, J(-e'^2 D, 0, m\mathcal{N}_2, l\mathcal{N}_2; q\mathcal{N}_3)$.

Remark 19. It is important to observe that the identity (5.8) (or any naive variant) is not
true in general without the coprimality assumptions. We see this clearly in the proof where a
multiplicative factor has to be non-zero. This is the main reason why we have made the
restrictions on the level $\mathcal{N}$ and the primality of $D$ in Theorem 1.
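Proof. Since $q$ is coprime with $\mathcal{N}_3$ we have by twisted multiplicativity:

(5.9) $J(-e'^2 D, 0, m\mathcal{N}_2, l\mathcal{N}_2; q\mathcal{N}_3) = J(-e'^2 D\,\overline{\mathcal{N}_3}^{\,2}, 0, m\mathcal{N}_2, l\mathcal{N}_2; q)\, J(-e'^2 D\,\bar{q}^{\,2}, 0, m\mathcal{N}_2, l\mathcal{N}_2; \mathcal{N}_3)$.

The first term of the right-hand side is equal to $J(-e^2 D, 0, m\overline{\mathcal{N}_2}, l; q)$ by the change of variables
$(x,y) \mapsto (\overline{\mathcal{N}_2}^{\,2} x, \overline{\mathcal{N}_2}\, y)$ in the Definition 5.1 of $J$.

The second term of the right-hand side is equal to $\chi_D(\mathcal{N}_3)$, which concludes the proof of
the lemma since $\chi_D(\mathcal{N}_3)$ is non-zero. Indeed it is not difficult to see that this term is equal
to ($\mathcal{N}_3$ is squarefree and coprime with $e'$):

(5.10) $J(-D,0,0,0;\mathcal{N}_3) = \prod_{p \mid \mathcal{N}_3} \frac{1}{p} \sum_{n\ (p)}\ \sum_{x\ (p)}^{*} e_p\big((n^2 - D)x\big)$.

Making the Ramanujan sum ($x$-variable) explicit, the last sum is:

(5.11) $\#\{n\ (p) : n^2 \equiv D\ (p)\} - 1 = \left(\frac{D}{p}\right) = \chi_D(p)$.

The last equalities hold because $p \mid \mathcal{N}_3$ is odd. □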
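The identity (5.8) can be tested numerically on small cases, which is a useful safeguard given how many inverses and coprimality conditions are involved. The sketch below is self-contained and rests on two assumptions of ours: the normalization of $J$ read off from Definition 5.1, and the splitting $\mathcal{N}_2 = \mathcal{N}_3\mathcal{N}_4$ with $\mathcal{N}_4 \mid e$, $(\mathcal{N}_3, e) = 1$, $e' = e/\mathcal{N}_4$. For odd positive arguments $\chi_D$ is computed as a Jacobi symbol.

```python
import cmath
from math import gcd

def e(num: int, q: int) -> complex:
    """Additive character e_q(num)."""
    return cmath.exp(2j * cmath.pi * (num % q) / q)

def J(n1: int, r1: int, n2: int, r2: int, q: int) -> complex:
    """Brute-force J(n1, r1; n2, r2; q) (assumed normalization, see lead-in)."""
    total = 0
    for x in range(1, q + 1):
        if gcd(x, q) != 1:
            continue
        xbar = pow(x, -1, q)
        for y in range(q):
            total += e((y * y + r1 * y + n1) * xbar + n2 * x + r2 * y, q)
    return e(r1 * r2, 2 * q) * total / q

def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n >= 1; equals chi_D(n) when a = D."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def lemma_5_1_sides(D, e_int, N3, N4, q, m, l):
    """Return both sides of (5.8) for N2 = N3 * N4 (hypothetical helper)."""
    N2 = N3 * N4
    assert e_int % N4 == 0 and gcd(N3, e_int) == 1
    assert gcd(N2, D * q) == 1
    e_prime = e_int // N4
    N2_bar = pow(N2, -1, q)
    lhs = J(-e_int ** 2 * D, 0, m * N2_bar, l, q)
    rhs = jacobi(D, N3) * J(-e_prime ** 2 * D, 0, m * N2, l * N2, q * N3)
    return lhs, rhs
```

A call such as `lemma_5_1_sides(-7, 1, 3, 1, 5, 2, 1)` exercises the case $\chi_D(\mathcal{N}_3) = -1$, while `lemma_5_1_sides(-7, 3, 11, 3, 5, 2, 1)` exercises a non-trivial $\mathcal{N}_4$.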



6. A variation on the $\delta$-symbol method.

Despite the apparent routine of this section, the estimates are really delicate. The variable 
is particularly sensitive: for instance the $(q\Omega + |u|)^{-1}$ from Lemma 6.1 cannot be replaced by
$q^{-1}\Omega^{-1}$ without damaging the proof in the next section.

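6.1. Voronoi summation formula. In detecting cancellations in sums of Fourier coefficients
the Voronoi summation formula is a convenient and classical tool. We shall use the following
variant, borrowed from [33, Theorem A.4]:

Proposition 6.1. Let $f$ be a primitive newform of weight 2 and level $\mathcal{N}$. Assume that $q$ is
such that $(q, \mathcal{N}/(q,\mathcal{N})) = 1$. Let $\mathcal{N}_1$ and $\mathcal{N}_2 \ge 1$ be such that

(6.1) $\mathcal{N} = \mathcal{N}_1\mathcal{N}_2$ ; $\mathcal{N}_1 = (q,\mathcal{N})$ ; $(q,\mathcal{N}_2) = 1$.

Let $d$ be an integer coprime to $q$ and $g$ be a smooth function of compact support. Then:

(6.2) $\sum_{m=1}^{\infty} \lambda_f(m)\, e\!\left(\frac{dm}{q}\right) g(m) = -2\pi\, \frac{\eta_f(\mathcal{N}_2)}{q\sqrt{\mathcal{N}_2}} \sum_{m=1}^{\infty} \lambda_f(m)\, e\!\left(-m\frac{\overline{d\mathcal{N}_2}}{q}\right) \hat{g}(m;q)$

where $\eta_f(\mathcal{N}_2)$ is a complex number of modulus 1 and:

(6.3) $\hat{g}(y;q) := \int_0^{\infty} g(x)\, J_1\!\left(\frac{4\pi\sqrt{xy}}{q\sqrt{\mathcal{N}_2}}\right) dx$.

Remark 20. When $\mathcal{N}$ is squarefree, the condition on $q$ is always fulfilled. In the sequel we
do place ourselves in this case and shall use the decomposition $\mathcal{N} = \mathcal{N}_1\mathcal{N}_2$ from (6.1) without
further indication (but one should be aware that $\mathcal{N}_1$ and $\mathcal{N}_2$ depend on $q$, or more precisely
on $(q,\mathcal{N})$).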
6.2. Setting-up the $\delta$-symbol. The capital letters $U$, $\Omega$, $Q$ shall denote the lengths of various
sums. We postpone the definitive choice of these quantities until § 6.4 for the sake
of clarity. These choices are slightly unusual. The principles underlying this section are
known, and we shall follow [11] closely. A difference is that we shall need here a control
on the smoothness in the $q$-variable. This aspect is crucial for our purpose (recall that the
cancellations ultimately come from the $q$-sum, cf. Theorem A), therefore we include brief
proofs of the key estimates.

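Fix once and for all a function $W \in C_c^{\infty}((-1,1))$ with $W(0) = 1$ and put $\phi(u) := W(u/U)$,
$u \in \mathbb{R}$. Let $\omega$ be a smooth function of compact support in $(\Omega, 2\Omega)$ such that (the constants
are absolute):

(6.4) $\sum_{r=1}^{\infty} \omega(r) = 1$ ; $\omega^{(i)} \ll_i \Omega^{-i-1}$, $\forall i \in \mathbb{N}$.

The $\delta$-symbol, which is 1 if $n = 0$ and 0 else, is expressed with additive characters (Ramanujan
sums):

(6.5) $\delta(n) = \phi(n) \sum_{q \ge 1} \Delta_q(n) \sum_{d\ (q)}^{*} e\!\left(\frac{dn}{q}\right)$, $\forall n \in \mathbb{Z}$,

where

(6.6) $\Delta_q(u) := \sum_{r=1}^{\infty} \frac{1}{qr}\left(\omega(qr) - \omega\!\left(\frac{|u|}{qr}\right)\right)$, for $u \in \mathbb{R}$.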
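The expansion of the $\delta$-symbol into Ramanujan sums can be checked numerically. The sketch below assumes the Duke-Friedlander-Iwaniec shape $\Delta_q(u) = \sum_{r \ge 1} (qr)^{-1}\big(\omega(qr) - \omega(|u|/(qr))\big)$ with $\sum_r \omega(r) = 1$; the bump function, the value `OMEGA = 8` and all function names are illustrative choices of ours. The cutoff $\phi(n)$ is omitted since $\phi(0) = 1$ and the sum over $q$ already vanishes for $n \ne 0$.

```python
from math import cos, exp, gcd, pi

OMEGA = 8  # hypothetical parameter: omega is supported in (OMEGA, 2 * OMEGA)

def bump(x: float) -> float:
    """Smooth bump supported in (OMEGA, 2 * OMEGA)."""
    if x <= OMEGA or x >= 2 * OMEGA:
        return 0.0
    return exp(-1.0 / ((x - OMEGA) * (2 * OMEGA - x)))

NORM = sum(bump(r) for r in range(OMEGA + 1, 2 * OMEGA))

def omega(x: float) -> float:
    """Normalized so that sum_{r >= 1} omega(r) = 1."""
    return bump(x) / NORM

def ramanujan(q: int, n: int) -> float:
    """Ramanujan sum c_q(n) = sum over d mod q, (d, q) = 1, of e(d n / q)."""
    return sum(cos(2 * pi * d * n / q) for d in range(1, q + 1) if gcd(d, q) == 1)

def Delta(q: int, u: float) -> float:
    """Delta_q(u); the r-sum is finite because omega has compact support."""
    u = abs(u)
    r_max = int(max(2 * OMEGA, u / OMEGA)) // q + 1
    return sum((omega(q * r) - omega(u / (q * r))) / (q * r)
               for r in range(1, r_max + 1))

def delta(n: int) -> float:
    """sum_q Delta_q(n) c_q(n), expected to be 1 for n = 0 and 0 otherwise."""
    q_max = int(max(2 * OMEGA, abs(n) / OMEGA)) + 1
    return sum(Delta(q, n) * ramanujan(q, n) for q in range(1, q_max + 1))
```

The truncation points `r_max` and `q_max` reflect claim (i) of Lemma 6.1: only $q$ up to roughly $\max(\Omega, |n|/\Omega)$ contribute.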



Lemma 6.1. (i) The function $\Delta_q\phi$ identically vanishes unless $1 \le q \le Q$, where
$Q := \max(\Omega, U/\Omega)$.

(ii) The high derivatives of $\Delta_q$ satisfy, for all $i > 0$:

(6.7) $\Delta_q^{(i)}(u) \ll_i (q\Omega)^{-i-1}$, $u \in \mathbb{R}$.

(iii) We have the following uniform bounds ($u \in \mathbb{R}$):
(6.8) $\Delta_q(u) \ll \Omega^{-2} + (q\Omega + |u|)^{-1}$.

(iv) We have the following bound for the derivative:

(6.9) $\frac{\partial}{\partial q}\Delta_q(u) \ll q^{-1}\Omega^{-2} + q^{-2}\Omega^{-1}$.

Remark 21. The bounds in (6.7) and (6.9) are uniform in the $u$-variable, which is sufficient
for our purpose. On the other hand the presence of $u$ as $(q\Omega + |u|)^{-1}$ in (6.8) is necessary in
the sequel.



Proof. Claims (i) and (ii) are immediate. The proof of claim (iii) may be found in [11, 
Lemma 2]. We repeat it for convenience. We use of the inequality {^} ^ min(l, ^). Observe 

that 

(6.10) $\int_0^{\infty} \left(\omega(r) - \omega\!\left(\frac{|u|}{r}\right)\right) \frac{dr}{r} = 0$.

This implies (Euler-Maclaurin formula of order 1):

Jo Q r Jo r q 



(6.11 



r 



The proof of (iv) is similar, write: 



UJ 



dq ^-^ q^r qr q q^r"^ qr 



q qr T qr q qr^^ r ^ 

« / -\d-uj{-)\ + \d-u{r)\ + \d-J{r)\ + -\dA^uj'{-)\ 
Jq q qr r qr q q qr^ r 

<. q-'^Q.-^ + q-^Vt-"^ + q-^Q.-"^ + q-'^Q.-^ . □ 

Let $h \ge 1$ be an integer and assume from now on that $U$ is chosen such that:

(6.12) $U \le h/2$.

In particular $m \mapsto \Delta_q\phi(m - h)$ vanishes unless $m \ge 1$. Writing $\lambda_f(h) = \sum_m \lambda_f(m)\delta(h - m)$
and inserting the expression (6.5) for the $\delta$-symbol yields:

(6.13) $\lambda_f(h) = \sum_{q \ge 1} \sum_{d\ (q)}^{*} e\!\left(-\frac{dh}{q}\right) \sum_{m \ge 1} \lambda_f(m)\, \Delta_q\phi(m - h)\, e\!\left(\frac{dm}{q}\right)$.

We may apply the Voronoi summation formula to the $m$-sum because the function

(6.14) $g(x;h;q) := \Delta_q\phi(x - h)$

is of compact support thanks to the function $\phi$. This gives an expansion of $\lambda_f(h)$ in terms of
sums of Kloosterman sums:

Proposition 6.2. Under condition (6.12), we have:

(6.15) $\lambda_f(h) = -2\pi \sum_{m \ge 1} \lambda_f(m) \sum_{q \ge 1} \frac{\eta_f(\mathcal{N}_2)}{q\sqrt{\mathcal{N}_2}}\, S(m\overline{\mathcal{N}_2}, h; q)\, \hat{g}(m;h;q)$,
where $S(\cdot,\cdot;q)$ denotes the classical Kloosterman sum and:

(6.16) $\hat{g}(y;h;q) := \int_0^{\infty} g(x;h;q)\, J_1\!\left(\frac{4\pi\sqrt{xy}}{q\sqrt{\mathcal{N}_2}}\right) dx$.

Remark 22. If the weight of $f$ were $\ge 4$ (as in [3]), we could have used the fact that Poincaré
series span the finite dimensional space of holomorphic forms of level $\mathcal{N}$, and combine this
with their explicit Fourier expansion (which is close to the right-hand side of (6.15)). This
approach doesn't work for Maass forms and weight 2 forms. Also the $\delta$-symbol offers more
flexibility in the choice of the test function: here the function $\hat{g}$ shall decay rapidly as $y \to \infty$
and vanishes unless $q \le Q$. See [46, Introduction] for a similar discussion.

The proof of the following lemma is straightforward (we make use of the fact that q ^ Q ^ 

u/ny. 
Lemma 6.2. (i) Unless $x \in (h - U, h + U)$, $g(x;h;q)$ vanishes.

(ii) The high derivatives of the function $x \mapsto g(x;h;q)$ satisfy ($i \in \mathbb{N}$):

(6.17) $g^{(i)}(x;h;q) \ll_i C_1 \times \min(U, q\Omega)^{-i}$.

The factor $C_1 = C_1(|D|,U,\Omega)$ is a polynomial in $|D|$, $U$ and $\Omega$ whose coefficients do not
depend on $i$.

The following estimate is classical but we shall provide a quick proof because of its
importance.

Lemma 6.3. The Hankel transform $\hat{g}$ satisfies, for any integer $A > 0$:

(6.18) $\hat{g}(y;h;q) \ll_A C_2 \times \left(\frac{y}{h}\right)^{-A} \min(U/q, \Omega)^{-2A}$.
Here $C_2 = C_2(y,|D|,U,\Omega)$ is a polynomial in $y$, $|D|$, $U$ and $\Omega$.

Proof. The basic idea is to integrate the Bessel function by parts. An elegant way is to use
the following formula, see [23, p. 51] or [24]:

2A 

(6.19) Ji(Vi) = Y^Ca,Az''-^ [Ji+,(Vi)]^"^ . 

a=0 

The constants Ca,A are absolute ; in the following a or a denote an arbitrary integer between 
and 2A. 

r g{x).h{^^^)dx <^ r g(^)J,{^z)dz ^^aY] H \g{^)z^-^Y''^ Jl+a{V~z)dz 

Jo Q Jo y Jo L y J 









ai — 


Jo 


L y J 



Z" -Jl+a{\/z)dz 



«aC2X V min([/,gf])--(^)7-^^^" ^ 

«AC2xf^)"^j; (—777 

In the second line we have used the fact that the support of $g$ is included in $(h - U, h + U)$.
The claim follows because $U < h$ so that $a = 2A$ is the dominant term in the last sum. □

6.3. Restriction. From (6.18), the function g is very small when — min(C//g, il)^ > IDC*. 

Thus, up to a negligible term, we may restrict the m-summation in equation (6.15) to (we 
use the fact that q ^ Q): 

h ri^ f/ 

(6.20) 1 ^ m ^ - X max(— , —r) x |D|'?*. 

U U \V 

Remark 23. Because of (6.12), the right-hand side is always greater than 1. This is consistent
with (6.15) in the sense that the sum on the right-hand side certainly cannot be empty whatever
the choice of $U$ and $\Omega$.

6.4. Choice of the parameters. We make now explicit the choice of the initial parameters 
U and rj. Later on, the integer h will be such that |Z)|^~^'' < h < |L'|^+''''. We choose: 

(6.21) $U := |D|^{1-2\eta_4}$ ; $\Omega := |D|^{1/2-\eta_4}$.
As a consequence, $Q = |D|^{1/2-\eta_4}$; and inequality (6.20) becomes:

(6.22) l^mi^\D\^'^\ 



a 
7. Proof of Theorem 1.

In this section, we establish Theorem 1, making use of results from sections 5 and 6. First
we state a more general version which was needed in the application to moments of quadratic
L-functions (Theorem 2, section 4):

Theorem 3. Let $\epsilon > 0$. There exist an absolute constant $A$ and a real number $\eta = \eta(\epsilon) > 0$
depending on $\epsilon$ only such that the following holds. Let $f$ be a modular form of weight 2 and
odd squarefree level and denote by $\lambda_f$ its normalized Fourier coefficients, see (1.1). Then:

(7.1) $\sum_{N < n < 2N} \lambda_f(n^2 - De^2) \ll_f |D|^{1/2-\eta}$,

for all triples $(D, e, N)$ where $D$ is a fundamental negative discriminant whose prime factors
are all greater than $|D|^{\epsilon}$ and $e$ and $N$ are positive integers with $N < |D|^{1/2+\eta}$. The implied
constant depends on $f$ only (a polynomial in its level).

Remark 24. Theorem 1 (where $d := -De^2$) corresponds to the particular case where $e = 1$
and $D$ is a prime discriminant. In that case we may choose $\epsilon = 1$ so that $\eta > 0$ is an absolute
constant, as claimed. Actually we expect Theorem 3 to hold with an absolute $\eta$ and without
the constraint on the prime factors of $D$.
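To get a feeling for the sums in Theorem 1, one can experiment with a concrete form. The sketch below uses the weight 2 newform of level 11, $f = q\prod_{n \ge 1}(1-q^n)^2(1-q^{11n})^2$ (a standard example, not taken from the text), computes its coefficients with the pentagonal number theorem, and evaluates $\sum_{N<n<2N}\lambda_f(n^2+d)$ with $\lambda_f(n) = a_n/\sqrt{n}$ as in (1.1). The choice $d = 1019$, $N = 32$ in the usage note and all function names are ours; the code only illustrates the objects and the trivial bound, and does not demonstrate the power saving.

```python
from math import sqrt

def euler_terms(M: int):
    """Nonzero terms (exponent, coefficient) of prod_{n>=1} (1 - q^n) up to q^M."""
    terms = [(0, 1)]
    k = 1
    while k * (3 * k - 1) // 2 <= M:
        sign = -1 if k % 2 else 1
        terms.append((k * (3 * k - 1) // 2, sign))
        if k * (3 * k + 1) // 2 <= M:
            terms.append((k * (3 * k + 1) // 2, sign))
        k += 1
    return terms

def times_sparse(dense, terms, step: int, M: int):
    """Multiply a dense series by the sparse series sum coef * q^(step * expo), mod q^(M+1)."""
    out = [0] * (M + 1)
    for expo, coef in terms:
        shift = expo * step
        if shift > M:
            continue
        for i in range(M + 1 - shift):
            if dense[i]:
                out[i + shift] += coef * dense[i]
    return out

def level11_coefficients(M: int):
    """List a with a[n] the n-th coefficient of q * prod (1-q^n)^2 (1-q^(11n))^2, 1 <= n <= M."""
    terms = euler_terms(M)
    series = [1] + [0] * M
    for step in (1, 1, 11, 11):
        series = times_sparse(series, terms, step, M)
    return [0] + series[:M]          # shift by one for the leading factor q

def divisor_count(n: int) -> int:
    """Divisor function tau(n)."""
    return sum(2 - (d * d == n) for d in range(1, int(sqrt(n)) + 1) if n % d == 0)

def nonsplit_sum(a, d: int, N: int) -> float:
    """Sum of lambda_f(n^2 + d) over N < n < 2N, with lambda_f(n) = a[n] / sqrt(n)."""
    return sum(a[n * n + d] / sqrt(n * n + d) for n in range(N + 1, 2 * N))
```

For example, with `a = level11_coefficients(5000)` one may compare `nonsplit_sum(a, 1019, 32)` with the trivial bound `sum(divisor_count(n * n + 1019) for n in range(33, 64))` coming from Deligne's estimate.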

7.1. Reduction to a smooth version. First we consider the equivalent smooth version
of (7.1) (see for instance [13, § 4] or [29, § 5.6] for some details on how to "smooth things
out"). We ought to prove that there exist absolute constants $A > 0$ and $s \in \mathbb{N}$ as well as a
real number $\eta_4 = \eta_4(\epsilon)$, depending on $\epsilon$ only, such that

00 

(7.2) Y,>^fin' + d)V{^) «f |Z)|V2-.4 

n=l ^ ^ 

holds uniformly, where: 
• $V \in C_c^{\infty}((1,2))$;

• $D$ is a negative discriminant whose prime factors are greater than $|D|^{\epsilon}$;

• $N > 0$ is a real number such that $N < |D|^{1/2+\eta_4}$;

• $e \ge 1$ is an integer.

Here ||.|| = \\.\\^ denotes the sup norm. The estimate (7.2) is trivial unless e is a very small 
power of \D\. It is also trivial when N is much smaller than D. Thus we may and do assume 
in the sequel that 

(7.3) IDI^-^* < iV < \D\^+'i^ and 1 ^ e < 

Proof that (7.2) implies (7.1). We choose V € C^((l, 2)) which is 1 on the interval {l+S, 2-5) 
and such that F^*) (absolute constants). Then: 



^X.in^-De^Wi^) 

n=l 



N<n<2N 

|2)|l/2-r?4 .s~'+5N-logN 



+ 6N ■ log N 



We choose <5 = |i:)|-(^+^4)/(i+s)^ ^nd put r/ := 7/4/(1 + 2s): 

\D\^^^-''log\D\. 

In the first line we have made use of Deligne's bound: $|\lambda_f(n)| \le \tau(n)$ for all $n \ge 1$. In the
second line we have made use of assumption (7.2). □

7.2. Applying the $\delta$-symbol method. From now on, denote by $S$ the left-hand side
of (7.2). To ease notation, put $d := -De^2$. By Proposition 6.2 we have:



(7.4) 



5 « |A/(m) 



Y^ Yl S{mjr2, + d; q)V(^)g(m; r? + d\ q) 

q^l ^ n=l 



The assumption ll?!^"^* < h < \D\^~^'^'^ from § 6.3 is satisfied because h = n? + d and 
N < n < 2N. From (6.22) we may and do cut the sum S = Si + S2 into two pieces. In 5*2 we 
restrict the summation to m ^ up to a negligible error term: 

(7.5) 5i <A |L>r^''* , for all A >0. 

7.3. Applying Poisson formula. Cancellations in (7.4) arise both from the $q$ and $n$ sums.
First we apply the Poisson summation formula to the $n$-sum (the outcome is - roughly speaking
- that the $n$-sum occupies the (mod $q$) residue classes uniformly). Recall that $\hat{g}$ is zero
unless $q \le Q = |D|^{1/2-\eta_4}$ and that $\mathcal{N}_2 = \mathcal{N}/(\mathcal{N},q)$ depends (mildly) on $q$.

Lemma 7.1. For each m,q^ 1, we have: 
(7.6) 

n 



00 

n=l 



,ln , 



S{mJ\f2,r? + d;q)g{m;n^ + d;q)V{ — ) = Yh{m;l;q) ^ S{mJ\f2,n^ + d;q)e{ — ), 
where h{m;l;q) is defined below by (7.14) and satisfies: 

Dm. 



Q<i<A 



y(0 



(7.7) /i(m;/;g) <A X ( 
Here C3 = C-i{m, I, q, \D\) is a polynomial in m, I, q and \D\. Furthermore: 



for allA>0,ly^ 0. 



(7.8) 



h{l; m;q) = unless 1 ^ q ^ Q, 



and we have the uniform bound: 
(7.9) 



h{m,l;q) < q-^\D\^/^+'^^ \\V\\ 
h{m; I; q) < g"^|Z)|^+''=^ 11 , for all I e Z and 1 q ^ Q. 



The first derivative satisfies: 

<"°' I 

Proof. By Poisson summation formula we have: 

00 

(7.11) Yl g{m]{n + tqf + d-q) 

t=—oo 

where 

(7.12) h{m;l;q): 
It is clear that (7.8) holds. 



^^e(j^)h{m]l;q) 



g{m; + d; q)V {^)e{-—)dz . 

N q 



Footnote: A puzzling remark is the following. We have explained how to smooth the sum from (7.1) to (7.2). This smoothness is necessary to apply the Poisson formula to (7.4). If identity (7.4) were in its unsmoothed form (i.e. with the sharp cutoff $N < n < 2N$), it would not be possible to smooth it out afterwards, because inserting Weil's bound for Kloosterman sums in (7.4) would yield a bound much weaker than the one required. This is because we really need cancellations in both the $q$-sum and the $n$-sum. In other words, it is not possible to reverse the order of the transformations: first smoothing (§7.1) and then applying the $\delta$-symbol (§7.2) is the only workable sequence.
The estimate (7.7) follows by repeated integration by parts once we know that

(7.13) ^^g{m; + d; q) «. C3 x {^)-\ 

(because of > \D\^/'^~^'^). Estimate (7.13) follows from the corresponding estimate for g 
and formula (6.16). One needs to differentiate Aq(j){m — d— z^) in the z-variable, and for this 

it is enough to observe that <Ci U~'' and A^*^ <i '^~/^'* * ''iV''"' ''iV'' ' '^^^ 
i € N. 

Consider now estimate (7.10). Inserting the formula (6.16) for g in the definition of h yields: 

(7.14) h{l-m-q) = - / A,(/>(x + b'D - z)V{-)Ji{ ^ )e( )dxdz 

q J -00 Jo ^ g 

The Bessel function satisfies (rough bounds, absolute constants):

(7.15)  $J_1(z) \ll (1 + z)^{-1/2} \ll 1$;  $J_1'(z) = \tfrac{1}{2}\bigl(J_0(z) - J_2(z)\bigr) \ll 1$.

A bound for $\Delta_q$ and its $q$-derivative is given in (6.8). From:

(7.16) / {q9.+ \x\)-^dx <.\og{q^ + M), 
we deduce that h{l; m; q) is bounded by: 

(7.17) q-'^{n-'^M + log{qn + M)) x iV <C g"^J^~^iVmax(M, 17^) ^ g-ij/^li/s+'ya. 

When introducing the differentiation $\partial/\partial q$, we obtain a sum of four terms of the same kind and the previous bound gets multiplied by

(7.18) q-^ + q-^ + q-^VM^ + q-^lN < q-^\D\^/^+'^^ 

which yields (7.10). □ 
Remark 25. We have seen in the proof that: 

(7.19) h{m- 1; q) « q-'^\D\^+''^^ \\V\\ . 

(this also follows from (7.10) and (7.8), or may be checked directly from the corresponding bound for $g$). This bound is of the same strength as (7.9) as long as $q$ is near the top of its range (where, in particular, the functions $h$ and $g$ are bounded by an arbitrarily small power of $|D|$). However, estimate (7.9) is necessary to control the $q$-sum for small $q$. This observation is useful for keeping track of the estimates during the proof of Theorem 3.

7.4. End of the proof. From (7.4) and Lemma 7.1 it remains to estimate: 
(7.20) 



(M,A/'2)=1 



h{m;l;q)- ^ S{mJ\f2,n'^ + d; q)e{ — ) 

{q,M2)=l ^ neZ/qZ ^ 

Ml\q 



Recall that the (complete) exponential sum has square-root cancellation, see (5.6) from Theorem A. To conclude the proof we appeal to cancellations in the $q$-sum.
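The size of the complete sum can be explored numerically for small prime moduli. The sketch below is an illustration only and does not restate (5.6), whose precise form is given in Theorem A. It relies on a standard fact assumed here for odd prime $q$: completing the square turns the inner $n$-sum into a Gauss sum and leaves a Salié-type sum, so the complete sum has modulus at most $2q$, whereas applying Weil's bound to each Kloosterman sum separately only gives about $2q^{3/2}$. The helper names `complete_sum` and `worst_ratio` are hypothetical.

```python
import cmath
import math


def e(x):
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def kloosterman(m, n, q):
    """Kloosterman sum S(m, n; q) for a modulus q >= 2."""
    total = 0j
    for x in range(1, q):
        if math.gcd(x, q) == 1:
            x_inv = pow(x, -1, q)  # modular inverse (Python 3.8+)
            total += e(((m * x + n * x_inv) % q) / q)
    return total


def complete_sum(m, d, l, q):
    """Sum over n mod q of S(m, n^2 + d; q) * e(l * n / q)."""
    return sum(kloosterman(m, n * n + d, q) * e(l * n / q) for n in range(q))


def worst_ratio(q, d, m_values):
    """Largest |complete_sum| / q over the given m and all l mod q."""
    return max(abs(complete_sum(m, d, l, q)) / q
               for m in m_values for l in range(q))


# For each small odd prime p, report the worst ratio |sum| / p.
for p in (3, 5, 7, 11, 13, 17, 19, 23):
    print(p, round(worst_ratio(p, d=3, m_values=(1, 2)), 3))
```

The printed ratios stay bounded as $p$ grows, which is the square-root cancellation in the $n$-variable; the remaining saving in the proof has to come from the sum over $q$.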
Up to a negligible error term we may restrict the $l$-sum to $|l|$ at most a fixed power of $|D|$; this is because of (7.7). The quantity in absolute values is (see Definition 5.1 and relation (5.3)):

(7.21) E:= H'm;l;q)J{d,0,mJ^,l;q)=Ei+E2, 

q, Mk 

where $E_1$ contains the terms with $q$ below a fixed threshold and $E_2$ the remaining ones. Making use of (7.9) and (5.6) one has:

(7.22)  $E_1 \ll |D|^{1/2 - \eta_2}\, \|V\|.$

For the remaining terms we perform an integration by parts (we use the fact that $h(m; l; q)$ is zero unless $q \le Q$), and utilize (5.8):

(7.23) E2 = -XDiM3) r ^h{m;l;x)\ V Ji-e'^D,0,mM2,lM2;q)\dx. 

(g,Ar2)=l;A/'iA/'3l<7 
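The integration by parts used here is Abel's summation formula: when the weight vanishes at the endpoint $Q$, a weighted sum equals minus the integral of the derivative of the weight against the partial sums, so a bound on partial sums transfers to the weighted sum. The sketch below checks this on toy data. The coefficients `a` and the weight `h` are hypothetical stand-ins chosen for illustration; they are not the quantities $J(\cdot)$ and $h(m; l; q)$ of the text.

```python
import math


def partial_sums(a, Q):
    """Return the list A with A[k] = a(1) + ... + a(k) for 0 <= k <= Q."""
    A = [0.0]
    for q in range(1, Q + 1):
        A.append(A[-1] + a(q))
    return A


def by_parts(a, h, Q):
    """Right-hand side of Abel summation, assuming h(Q) = 0.

    The partial-sum function is constant on each [k, k+1), so the integral
    of h'(x) * A(x) over [1, Q] equals the sum of A[k] * (h(k+1) - h(k)).
    """
    A = partial_sums(a, Q)
    return -sum(A[k] * (h(k + 1) - h(k)) for k in range(1, Q))


Q = 200


def a(q):
    """Hypothetical oscillating coefficients."""
    return math.cos(q * q)


def h(x):
    """Hypothetical smooth weight that vanishes at x = Q."""
    return (1.0 - x / Q) ** 2 / x


direct = sum(a(q) * h(q) for q in range(1, Q + 1))
parts = by_parts(a, h, Q)

# A bound on partial sums times the total variation of h bounds the sum.
A = partial_sums(a, Q)
variation = sum(abs(h(k + 1) - h(k)) for k in range(1, Q))
bound = max(abs(x) for x in A) * variation

print(abs(direct - parts), abs(direct) <= bound)
```

In the proof the role of the partial-sum bound is played by estimate (5.7) and the role of the variation by the derivative bound (7.10).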

From (7.10), (5.3), and estimate (5.7) from Theorem A we deduce: 

rQ 

E2<. , x'^\D\^+'^''\D\^''^''^UelmN)'^\\V'\\dx 

(7.24) J|D|^-2"2 " " 

< \D\^''^-'^^{elm)^\\V'\\. 
Returning to (7.20), we bound trivially the sums over $\mathcal{N}_1$, $m$ and $l$. This yields

(7.25) S2 < II?!^/^-"^ e^dl^ll + ll^'ID 
and concludes the estimation of (7.2) and the proof of Theorem 1.